Facebook datacenter "secrets"

Facebook Open Sources Its Servers and Data Centers. Facebook has shared many details of its new server and data center design on Building Efficient Data Centers with the Open Compute Project article and project. Open Compute Project effort will bring this web scale computing to the masses. The new data center is designed for AMD and Intel and the x86 architecture.


You might ask Why Facebook open-sourced its datacenters? The answer is that Facebook has opened up a whole new front in its war with Google over top technical talent and ad dollars. “By releasing Open Compute Project technologies as open hardware,” Facebook writes, “our goal is to develop servers and data centers following the model traditionally associated with open source software projects. Our first step is releasing the specifications and mechanical drawings. The second step is working with the community to improve them.”


By the by this data center approach has some similarities to Google data center designs, at least to details they have published. Despite Google’s professed love for all things open, details of its massive data centers have always been a closely guarded secret. Google usually talks about its servers once they’re obsolete.

Open Compute Project is not the first open source server hardware project. How to build cheap cloud storage article shows another interesting project.


  1. Tomi Engdahl says:
    Emerson, Facebook team to design ‘rapid deployment data center’

    Emerson Network Power (NYSE: EMR) announced that it is working with Facebook to design and deploy the company’s second data center building in Luleå, Sweden. According to a press release, the “Luleå 2” facility will be the pilot for Facebook’s new “rapid deployment data center (RDDC)”, which was designed and developed in collaboration with Emerson Network Power’s data center design team.

    The Luleå 2 facility will span approximately 125,000 sq. ft. and Emerson will deliver over 250 shippable modules, including power skids, evaporative air handlers, a water treatment plant, and data center superstructure solutions. It will be built next to Facebook’s first data center building in Luleå, which came online in June 2013.

    “Because of our relentless focus on efficiency, we are always looking for ways to optimize our data centers including accelerating build times and reducing material use,”

  2. Pleasanton legal advisor says:
    Thanks for finally writing about >Facebook datacenter
    “secrets” | <Liked it!
  3. how to do affiliate marketing says:
    whoah this weblog is wonderful i love studying your articles.
    Keep up the great work! You know, a lot of persons
    are searching around for this info, you could aid them greatly.
  4. Tomi Engdahl says:
    Facebook Acquires Security Startup PrivateCore to Better Protect Its Data Centers

    Facebook announced on Thursday that it has acquired PrivateCore, an online security startup specifically focused on server security. Terms of the deal were not disclosed.

    The two-year-old startup will help Facebook keep its massive data centers safe from malware attacks and other forms of security breaches.

  5. Tomi Engdahl says:
    White Paper Download: Thermal Efficiency: Facebook’s Datacenter Server Design – EEdge Vol 3 Article
  6. Tomi Engdahl says:
    Facebook unveils Autoscale, its load-balancing system that achieves an average power saving of 10-15%

    acebook today revealed details about Autoscale, a system for power-efficient load balancing that has been rolled out to production clusters in its data centers. The company says it has “demonstrated significant energy savings.”

    For those who don’t know, load balancing refers to distributing workloads across multiple computing resources, in this case servers. The goal is to optimize resource use, which can mean different things depending on the task at hand.

    The control loop starts with collecting utilization information (CPU, request queue, and so on) from all active servers. The Autoscale controller then decides on the optimal active pool size and passes the decision to the load balancers, which distribute the workload evenly.

  7. Tomi Engdahl says:
    Startup Sees Enterprise Op for TLC NAND

    In some uses cases, Fife says, the company will employ lower-cost TLC NAND, particularly for what has been dubbed cold storage of data, and that the company’s variable code rate LDPC-based error-correcting code (ECC) memory can address endurance concerns. However, he believes, multi-level cell (MLC) is still the best option for hyperscale applications.

    Social networking giant Facebook has been vocal about wanting a low-cost flash technology, saying at last year’s Flash Summit that a relatively low-endurance, poor-performance chip would better serve its need to store some 350 million new photos a day. Not long after, Jim Handy, principal analyst at Objective Analysis, concluded that Facebook would have to settle for a hierarchy of DRAM-flash-HDD for the foreseeable future. TLC might be cheaper and viable for cold storage, but not as cheap as Facebook would like, he said.

  8. Tomi Engdahl says:
    Facebook Experimenting With Blu-ray As a Storage Medium

    The discs are held in groups of 12 in locked cartridges and are extracted by a robotic arm whenever they’re needed. One rack contains 10,000 discs, and is capable of storing a petabyte of data, or one million gigabytes. Blu-ray discs offer a number of advantages versus hard drives. For one thing, the discs are more resilient: they’re water- and dust-resistant, and better able to withstand temperature swings.

  9. Tomi Engdahl says:
    Facebook decided it couldn’t wait for companies like Arista to come out with new switches, so it will build its own. The Wedge switch (above), already being tested in production networks, will become a design Facebook will contribute to its Open Compute Project, an open-source hardware initiative.

    “We wanted to get agility because we are changing our requirements in a three-month cycle,” far faster than vendors like Arista and Broadcom can field new products, said Yuval Bachar, a former Cisco engineering manager, now working at Facebook.

    The company’s datacenters are approaching a million-server milestone, Bachar said. Today it uses 10 Gbit/s links from its top-of-rack servers, but it will need to upgrade in six to eight months, he said. The Wedge sports up to 32 40G ports.

    The most interesting thing about Wedge is its use of a small server card, currently using an x86 SoC. However it could be replaced with an ARM SoC or “other programmable elements,” Bachar said.

    Source: http://www.eetimes.com/document.asp?doc_id=1323695&page_number=2

  10. Tomi Engdahl says:
    Facebook, the security company
    CSO Joe Sullivan talks about PrivateCore and Facebook’s homegrown security clout.

    A VM in a vCage

    The technology PrivateCore is developing, vCage, is a virtual “cage” in the telecom industry’s usage of the word. It is software that is intended to continuously assure that the servers it protects have not had their software tampered with or been exploited by malware. It also prevents physical access to the data running on the server, just as a locked cage in a colocation facility would.

    The software integrates with OpenStack private cloud infrastructure to continuously monitor virtual machines, encrypt what’s stored in memory, and provide additional layers of security to reduce the probability of an outside attacker gaining access to virtual servers through malware or exploits of their Web servers and operating systems. If the “attestation” system detects a change that would indicate that a server has been exploited, it shuts it down and re-provisions another server elsewhere. Sullivan explained that the technology is seen as key to Facebook’s strategy for Internet.org because it will allow the company to put servers in places outside the highly secure (and expensive) data centers it operates in developed countries.

    “We’re trying to get a billion more people on the Internet,” he said. “So we have to have servers closer to where they are.”

    By purchasing PrivateCore, Facebook is essentially taking vCage off the market. The software “is not going to be sold,” Sullivan said. “They had a couple of public customers and a couple of private ones. But they took the opportunity to get to work with us because it will develop their technology faster.”

    Sullivan said the software would not be for sale for the foreseeable future. “The short-term goal is to get it working in one or two test-beds,“

    It’s been 18 months since Facebook was hit by a Java zero-day that compromised a developer’s laptop. Since then, Facebook has done a lot to reduce the potential for attacks and is using the same anomaly detection technology the company developed to watch for fraudulent Facebook user logins to spot problems within its own network and facilities.

    The Java zero-day, he said, “drove home that it’s impossible to secure an employee’s computer 100 percent.” To minimize what an attacker can get to, Facebook has moved virtually everything that employees work with into its own cloud—reducing the amount of sensitive data that resides on individual employees’ computers as much as possible.

  11. Tomi Engdahl says:
    Facebook’s Newest Data Center Is Now Online In Altoona, Iowa

    Facebook today announced that its newest data center in Altoona, Iowa, is now open for business. The new facility complements the company’s other centers in Prineville, Ore; Forest City, N.C. and Luleå, Sweden (the company also operates out of a number of smaller shared locations). This is the first of two data centers the company is building at this site in Altoona.

    What’s actually more interesting than the fact that the new location is now online is that it’s also the first data center to use Facebook’s new high-performance networking architecture.

    With Facebook’s new approach, however, the entire data center runs on a single high-performance network. There are no clusters, just server pods that are all connected to each other. Each pod has 48 server racks — that’s much smaller than Facebook’s old clusters — and all of those pods are then connected to the larger network.

  12. Tomi Engdahl says:
    Facebook’s New Data Center Is Bad News for Cisco

    Facebook is now serving the American heartland from a data center in the tiny town of Altoona, Iowa. Christened on Friday morning, this is just one of the many massive computing facilities that deliver the social network to phones, tablets, laptops, and desktop PCs across the globe, but it’s a little different from the rest.

    As it announced that the Altoona data center is now serving traffic to some of its 1.35 billion users, the company also revealed how its engineers pieced together the computer network that moves all that digital information through the facility. The rather complicated arrangement shows, in stark fashion, that the largest internet companies are now constructing their computer networks in very different ways—ways that don’t require expensive networking gear from the likes of Cisco and Juniper, the hardware giants that played such a large role when the foundations of the net were laid.

    From the Old to the New

    Traditionally, when companies built computer networks to run their online operations, they built them in tiers. They would create a huge network “core” using enormously expensive and powerful networking gear. Then a smaller tier—able to move less data—would connect to this core. A still smaller tier would connect to that. And so on—until the network reached the computer servers that were actually housing the software people wanted to use.

    For the most part, the hardware that ran these many tiers—from the smaller “top-of-rack” switches that drove the racks of computer servers, to the massive switches in the backbone—were provided by hardware giants like Cisco and Juniper. But in recent years, this has started to change. Many under-the-radar Asian operations and other networking vendors now provide less expensive top-of-rack switches, and in an effort to further reduce costs and find better ways of designing and managing their networks, internet behemoths such as Google and Facebook are now designing their own top-of-racks switches.

    This is well documented. But that’s not all that’s happening. The internet giants are also moving to cheaper gear at the heart of their massive networks. That’s what Facebook has done inside its Altoona data center. In essence, it has abandoned the hierarchical model, moving away from the enormously expensive networking gear that used to drive the core of its networks.

  13. Tomi Engdahl says:
    Facebook Weaves New Fabric
    Smaller switches are better, says datacenter.

    Facebook’s new datacenter marks the latest effort to design a warehouse-sized system as a single network. The effort suggests big datacenters may switch to using more smaller, cheaper aggregation switches rather than relying on –and being limited by– the biggest, fastest boxes they can purchase.

    The company described the fabric architecture of its new Altoona, Iowa, datacenter in a Web post. It said the datacenter uses 10G networking to servers and 40G between all top-of-rack and aggregation switches.

    The news comes just weeks after rival Microsoft announced it is starting to migrate all its servers to 40G links and switches to 100G. Microsoft suggested it might use FPGAs on future systems to extend bandwidth in the future given it is surpassing what current and expected Ethernet chips will deliver.

    Big datacenters have long been pushing the edge of networking which is their chief bottleneck. The new Facebook datacenter appears to try to solve the problem using a novel topology, rather than using more expensive hardware.

    Chip and systems vendors hurriedly developed efforts for 25G Ethernet earlier this year as another approach for bandwidth-starved datacenters. They hope some datacenters migrate from 10 to 25G to the server with road maps to 50 and possibly 200G for switches.

    Facebook suggested its approach opens up more bandwidth and provides and easier way to scale networks while still tolerating expected component and system failures. It said its 40G fabric could quickly scale to 100G for which chips and systems are now available although rather expensive.

    Facebook said its new design provides 10x more bandwidth between servers inside the datacenter where traffic growth rates are highest. It said it could tune the approach to a 50x bandwidth increase using the same 10/40G links. The fabric operates at Layer 3 using BGP4 as its only routing protocol with minimal features enabled.

    “Our current starting point is 4:1 fabric oversubscription from rack to rack, with only 12 spines per plane, out of 48 possible.”


Leave a Comment

Your email address will not be published. Required fields are marked *