Facebook datacenter "secrets"

Facebook has open-sourced its servers and data centers. The company has shared many details of its new server and data center design in the article Building Efficient Data Centers with the Open Compute Project and in the project itself. The Open Compute Project effort aims to bring this web-scale computing to the masses. The new data center design targets AMD and Intel processors on the x86 architecture.


You might ask why Facebook open-sourced its data centers. The answer is that Facebook has opened up a whole new front in its war with Google over top technical talent and ad dollars. “By releasing Open Compute Project technologies as open hardware,” Facebook writes, “our goal is to develop servers and data centers following the model traditionally associated with open source software projects. Our first step is releasing the specifications and mechanical drawings. The second step is working with the community to improve them.”


Incidentally, this data center approach has some similarities to Google's data center designs, at least to the details Google has published. Despite Google’s professed love for all things open, details of its massive data centers have always been a closely guarded secret. Google usually talks about its servers once they’re obsolete.

The Open Compute Project is not the first open source server hardware project. The article How to build cheap cloud storage describes another interesting one.


  1. Tomi Engdahl says:
    Emerson, Facebook team to design ‘rapid deployment data center’

    Emerson Network Power (NYSE: EMR) announced that it is working with Facebook to design and deploy the company’s second data center building in Luleå, Sweden. According to a press release, the “Luleå 2” facility will be the pilot for Facebook’s new “rapid deployment data center (RDDC)”, which was designed and developed in collaboration with Emerson Network Power’s data center design team.

    The Luleå 2 facility will span approximately 125,000 sq. ft. and Emerson will deliver over 250 shippable modules, including power skids, evaporative air handlers, a water treatment plant, and data center superstructure solutions. It will be built next to Facebook’s first data center building in Luleå, which came online in June 2013.

    “Because of our relentless focus on efficiency, we are always looking for ways to optimize our data centers including accelerating build times and reducing material use,”

  4. Tomi Engdahl says:
    Facebook Acquires Security Startup PrivateCore to Better Protect Its Data Centers

    Facebook announced on Thursday that it has acquired PrivateCore, an online security startup specifically focused on server security. Terms of the deal were not disclosed.

    The two-year-old startup will help Facebook keep its massive data centers safe from malware attacks and other forms of security breaches.

  5. Tomi Engdahl says:
    White Paper Download: Thermal Efficiency: Facebook’s Datacenter Server Design – EEdge Vol 3 Article
  6. Tomi Engdahl says:
    Facebook unveils Autoscale, its load-balancing system that achieves an average power saving of 10-15%

    Facebook today revealed details about Autoscale, a system for power-efficient load balancing that has been rolled out to production clusters in its data centers. The company says it has “demonstrated significant energy savings.”

    For those who don’t know, load balancing refers to distributing workloads across multiple computing resources, in this case servers. The goal is to optimize resource use, which can mean different things depending on the task at hand.

    The control loop starts with collecting utilization information (CPU, request queue, and so on) from all active servers. The Autoscale controller then decides on the optimal active pool size and passes the decision to the load balancers, which distribute the workload evenly.
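
    The control loop described above can be sketched roughly as follows. This is only an illustration, not Facebook's implementation: the function names, the 60 percent target utilization, and the sizing rule (keep the average CPU load near the target) are all assumptions made for the example. Only the overall shape (collect utilization, pick a pool size, spread the load evenly) comes from the article.

```python
def desired_pool_size(cpu_percent, target_percent=60, min_servers=1):
    """Decide how many servers should stay active.

    cpu_percent: CPU utilization (0-100) reported by each active server.
    target_percent: hypothetical per-server utilization the controller aims for.
    """
    total_load = sum(cpu_percent)
    # Integer ceiling division: servers needed to carry total_load at the target.
    needed = (total_load + target_percent - 1) // target_percent
    return max(min_servers, needed)


def distribute_evenly(total_requests, pool_size):
    """Split requests as evenly as possible across the active pool."""
    base, extra = divmod(total_requests, pool_size)
    return [base + (1 if i < extra else 0) for i in range(pool_size)]


if __name__ == "__main__":
    utilization = [30, 25, 35, 30]  # four lightly loaded servers
    size = desired_pool_size(utilization)
    print("active pool size:", size)
    print("requests per server:", distribute_evenly(1000, size))
```

    With four servers at roughly 30 percent each, this toy controller shrinks the active pool to two servers, which is the kind of consolidation that lets idle machines drop into a low-power state.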

  7. Tomi Engdahl says:
    Startup Sees Enterprise Op for TLC NAND

    In some use cases, Fife says, the company will employ lower-cost TLC NAND, particularly for what has been dubbed cold storage of data, and he says the company’s variable code rate LDPC-based error-correcting code (ECC) memory can address endurance concerns. However, he believes, multi-level cell (MLC) is still the best option for hyperscale applications.

    Social networking giant Facebook has been vocal about wanting a low-cost flash technology, saying at last year’s Flash Summit that a relatively low-endurance, poor-performance chip would better serve its need to store some 350 million new photos a day. Not long after, Jim Handy, principal analyst at Objective Analysis, concluded that Facebook would have to settle for a hierarchy of DRAM-flash-HDD for the foreseeable future. TLC might be cheaper and viable for cold storage, but not as cheap as Facebook would like, he said.

  8. Tomi Engdahl says:
    Facebook Experimenting With Blu-ray As a Storage Medium

    The discs are held in groups of 12 in locked cartridges and are extracted by a robotic arm whenever they’re needed. One rack contains 10,000 discs, and is capable of storing a petabyte of data, or one million gigabytes. Blu-ray discs offer a number of advantages versus hard drives. For one thing, the discs are more resilient: they’re water- and dust-resistant, and better able to withstand temperature swings.
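
    As a quick sanity check on the figures quoted above, the per-disc capacity and the number of cartridges per rack follow from simple arithmetic. The constant names are made up for this example, and the cartridge count assumes every disc sits in a 12-disc cartridge.

```python
DISCS_PER_RACK = 10_000
RACK_CAPACITY_GB = 1_000_000  # one petabyte, as stated in the article
DISCS_PER_CARTRIDGE = 12

# Capacity each disc must hold for the rack to reach a petabyte.
gb_per_disc = RACK_CAPACITY_GB // DISCS_PER_RACK

# Ceiling division: the last cartridge may be only partly filled.
cartridges_per_rack = -(-DISCS_PER_RACK // DISCS_PER_CARTRIDGE)

if __name__ == "__main__":
    print(gb_per_disc, "GB per disc")
    print(cartridges_per_rack, "cartridges per rack")
```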

  9. Tomi Engdahl says:
    Facebook decided it couldn’t wait for companies like Arista to come out with new switches, so it will build its own. The Wedge switch (above), already being tested in production networks, will become a design Facebook will contribute to its Open Compute Project, an open-source hardware initiative.

    “We wanted to get agility because we are changing our requirements in a three-month cycle,” far faster than vendors like Arista and Broadcom can field new products, said Yuval Bachar, a former Cisco engineering manager, now working at Facebook.

    The company’s datacenters are approaching a million-server milestone, Bachar said. Today it uses 10 Gbit/s links from its top-of-rack servers, but it will need to upgrade in six to eight months, he said. The Wedge sports up to 32 40G ports.

    The most interesting thing about Wedge is its use of a small server card, currently using an x86 SoC. However, it could be replaced with an ARM SoC or “other programmable elements,” Bachar said.

    Source: http://www.eetimes.com/document.asp?doc_id=1323695&page_number=2

  10. Tomi Engdahl says:
    Facebook, the security company
    CSO Joe Sullivan talks about PrivateCore and Facebook’s homegrown security clout.

    A VM in a vCage

    The technology PrivateCore is developing, vCage, is a virtual “cage” in the telecom industry’s usage of the word. It is software that is intended to continuously assure that the servers it protects have not had their software tampered with or been exploited by malware. It also prevents physical access to the data running on the server, just as a locked cage in a colocation facility would.

    The software integrates with OpenStack private cloud infrastructure to continuously monitor virtual machines, encrypt what’s stored in memory, and provide additional layers of security to reduce the probability of an outside attacker gaining access to virtual servers through malware or exploits of their Web servers and operating systems. If the “attestation” system detects a change that would indicate that a server has been exploited, it shuts it down and re-provisions another server elsewhere. Sullivan explained that the technology is seen as key to Facebook’s strategy for Internet.org because it will allow the company to put servers in places outside the highly secure (and expensive) data centers it operates in developed countries.

    “We’re trying to get a billion more people on the Internet,” he said. “So we have to have servers closer to where they are.”

    By purchasing PrivateCore, Facebook is essentially taking vCage off the market. The software “is not going to be sold,” Sullivan said. “They had a couple of public customers and a couple of private ones. But they took the opportunity to get to work with us because it will develop their technology faster.”

    Sullivan said the software would not be for sale for the foreseeable future. “The short-term goal is to get it working in one or two test-beds,”

    It’s been 18 months since Facebook was hit by a Java zero-day that compromised a developer’s laptop. Since then, Facebook has done a lot to reduce the potential for attacks and is using the same anomaly detection technology the company developed to watch for fraudulent Facebook user logins to spot problems within its own network and facilities.

    The Java zero-day, he said, “drove home that it’s impossible to secure an employee’s computer 100 percent.” To minimize what an attacker can get to, Facebook has moved virtually everything that employees work with into its own cloud—reducing the amount of sensitive data that resides on individual employees’ computers as much as possible.
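
    The “attestation” behaviour described earlier in this comment (detect a change, shut the server down, re-provision elsewhere) can be illustrated with a very small sketch. The article does not say how vCage actually measures a server, so the SHA-256 comparison against a known-good fingerprint below is an assumption, and all names are hypothetical.

```python
import hashlib


def measure(software_image: bytes) -> str:
    """Return a SHA-256 fingerprint of a server's software image."""
    return hashlib.sha256(software_image).hexdigest()


def attest(servers, trusted_measurement):
    """Split servers into (healthy, to_reprovision).

    servers: dict mapping server name -> bytes of its current software image.
    A server whose fingerprint differs from the trusted one is flagged so
    that it can be shut down and replaced.
    """
    healthy, to_reprovision = [], []
    for name, image in sorted(servers.items()):
        if measure(image) == trusted_measurement:
            healthy.append(name)
        else:
            to_reprovision.append(name)
    return healthy, to_reprovision


if __name__ == "__main__":
    golden = b"kernel + web server build 1"
    fleet = {"web-a": golden, "web-b": golden + b" + implant"}
    print(attest(fleet, measure(golden)))
```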

  11. Tomi Engdahl says:
    Facebook’s Newest Data Center Is Now Online In Altoona, Iowa

    Facebook today announced that its newest data center in Altoona, Iowa, is now open for business. The new facility complements the company’s other centers in Prineville, Ore; Forest City, N.C. and Luleå, Sweden (the company also operates out of a number of smaller shared locations). This is the first of two data centers the company is building at this site in Altoona.

    What’s actually more interesting than the fact that the new location is now online is that it’s also the first data center to use Facebook’s new high-performance networking architecture.

    With Facebook’s new approach, however, the entire data center runs on a single high-performance network. There are no clusters, just server pods that are all connected to each other. Each pod has 48 server racks — that’s much smaller than Facebook’s old clusters — and all of those pods are then connected to the larger network.

  12. Tomi Engdahl says:
    Facebook’s New Data Center Is Bad News for Cisco

    Facebook is now serving the American heartland from a data center in the tiny town of Altoona, Iowa. Christened on Friday morning, this is just one of the many massive computing facilities that deliver the social network to phones, tablets, laptops, and desktop PCs across the globe, but it’s a little different from the rest.

    As it announced that the Altoona data center is now serving traffic to some of its 1.35 billion users, the company also revealed how its engineers pieced together the computer network that moves all that digital information through the facility. The rather complicated arrangement shows, in stark fashion, that the largest internet companies are now constructing their computer networks in very different ways—ways that don’t require expensive networking gear from the likes of Cisco and Juniper, the hardware giants that played such a large role when the foundations of the net were laid.

    From the Old to the New

    Traditionally, when companies built computer networks to run their online operations, they built them in tiers. They would create a huge network “core” using enormously expensive and powerful networking gear. Then a smaller tier—able to move less data—would connect to this core. A still smaller tier would connect to that. And so on—until the network reached the computer servers that were actually housing the software people wanted to use.

    For the most part, the hardware that ran these many tiers, from the smaller “top-of-rack” switches that drove the racks of computer servers to the massive switches in the backbone, was provided by hardware giants like Cisco and Juniper. But in recent years, this has started to change. Many under-the-radar Asian operations and other networking vendors now provide less expensive top-of-rack switches, and in an effort to further reduce costs and find better ways of designing and managing their networks, internet behemoths such as Google and Facebook are now designing their own top-of-rack switches.

    This is well documented. But that’s not all that’s happening. The internet giants are also moving to cheaper gear at the heart of their massive networks. That’s what Facebook has done inside its Altoona data center. In essence, it has abandoned the hierarchical model, moving away from the enormously expensive networking gear that used to drive the core of its networks.

  13. Tomi Engdahl says:
    Facebook Weaves New Fabric
    Smaller switches are better, says datacenter.

    Facebook’s new datacenter marks the latest effort to design a warehouse-sized system as a single network. The effort suggests big datacenters may switch to using a larger number of smaller, cheaper aggregation switches rather than relying on, and being limited by, the biggest, fastest boxes they can purchase.

    The company described the fabric architecture of its new Altoona, Iowa, datacenter in a Web post. It said the datacenter uses 10G networking to servers and 40G between all top-of-rack and aggregation switches.

    The news comes just weeks after rival Microsoft announced it is starting to migrate all its servers to 40G links and its switches to 100G. Microsoft suggested it might use FPGAs in future systems to extend bandwidth, given that it is surpassing what current and expected Ethernet chips will deliver.

    Big datacenters have long been pushing the edge of networking, which is their chief bottleneck. The new Facebook datacenter appears to try to solve the problem using a novel topology rather than more expensive hardware.

    Chip and systems vendors hurriedly developed efforts for 25G Ethernet earlier this year as another approach for bandwidth-starved datacenters. They hope some datacenters migrate from 10 to 25G to the server with road maps to 50 and possibly 200G for switches.

    Facebook suggested its approach opens up more bandwidth and provides an easier way to scale networks while still tolerating expected component and system failures. It said its 40G fabric could quickly scale to 100G, for which chips and systems are now available, although rather expensive.

    Facebook said its new design provides 10x more bandwidth between servers inside the datacenter where traffic growth rates are highest. It said it could tune the approach to a 50x bandwidth increase using the same 10/40G links. The fabric operates at Layer 3 using BGP4 as its only routing protocol with minimal features enabled.

    “Our current starting point is 4:1 fabric oversubscription from rack to rack, with only 12 spines per plane, out of 48 possible.”
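
    One way to read the 4:1 figure in the quote: if a fabric switch has 48 rack-facing links and room for up to 48 spine-facing links, then populating only 12 spines leaves four times as much capacity pointing down as pointing up. The sketch below assumes one link per rack and one per spine, all at the same 40G speed mentioned in the article. The function name and parameters are made up for the example.

```python
from fractions import Fraction


def oversubscription(downlinks, uplinks, down_gbps=40, up_gbps=40):
    """Ratio of rack-facing capacity to spine-facing capacity on one fabric switch."""
    return Fraction(downlinks * down_gbps, uplinks * up_gbps)


if __name__ == "__main__":
    for spines in (12, 24, 48):
        ratio = oversubscription(downlinks=48, uplinks=spines)
        print(f"{spines} spines per plane -> {ratio}:1")
```

    Under these assumptions, going from 12 to 48 spines per plane moves the fabric from 4:1 to 1:1, that is, no oversubscription at this layer, without changing link speeds.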

  14. Tomi Engdahl says:
    Network lifecycle management and the Open OS

    The bare-metal switch ecosystem and standards are maturing, driven by the Open Compute Project.

    For decades, lifecycle management for network equipment was a laborious, error-prone process because command-line interfaces (CLIs) were the only way to configure equipment. Open operating systems and the growing Linux community have now streamlined this process for servers, and the same is beginning to happen for network switches.

    Network lifecycle management involves three phases: on-boarding or provisioning, production, and decommissioning. The state of network equipment is continually in flux as applications are deployed or removed, so network administrators must find ways to configure and manage equipment efficiently and cost-effectively.

    In the server world, the emergence of Linux-based operating systems has revolutionized server on-boarding and provisioning. Rather than using a CLI to configure servers one at a time, system administrators can use automation tools like Chef and Puppet to store and apply configurations with the click of a mouse. For example, suppose an administrator wants to commission four Hadoop servers. Rather than using a CLI to provision each of them separately, the administrator can instruct a technician to click on the Hadoop library in Chef and provision the four servers automatically. This saves time and eliminates the potential for configuration errors due to missed keystrokes or calling up an old driver.

    This kind of automated provisioning has been a godsend to network administrators and is fast becoming the standard method of lifecycle management for servers. But what about switches?

    Network administrators would like to use the same methodology for switches in their networks, but the historical nature of switches has held them back.

    Traditionally, network switches have been proprietary devices with proprietary operating systems. Technicians must use a CLI or the manufacturer’s own tools to provision a switch.

    Repeating the same mundane tasks over and over in a CLI can lead to errors and lost productivity.

    Today, three manufacturers (Big Switch, Cumulus, and Pica8) are offering Linux-based OSs for bare-metal switches that allow these switches to be provisioned with standard Linux tools.

    Application programming interfaces (APIs), such as RESTful interfaces that exchange JSON and interact with the operating system CLI, are becoming more common. APIs help draw a second parallel between server and network lifecycle thinking. Open APIs give developers a common framework to integrate with home-grown and off-the-shelf management, operations, provisioning and accounting tools. Chef and Puppet are becoming common tools on the server side that also extend functionality for networking. Linux-based network OSs are open and offer the ability to run applications like Puppet in user space: simply typing “apt-get install puppet” installs it natively on the switch itself.

    The three phases of network lifecycle management (on-boarding or provisioning, production, and decommissioning) all benefit from this combination of CLI, Linux, and open APIs. Tools around Linux help build the base of the stack, getting Linux onto the bare metal through even more fundamental tools like zero touch provisioning. A custom script using a JSON API might poll the switch OS for accounting data while in production. And lastly, Puppet could be used to push a new configuration to the switch, in effect decommissioning the previous application.
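
    A custom accounting script of the kind mentioned above might look roughly like this. The endpoint URL and the JSON layout (`interfaces`, `rx_bytes`, `tx_bytes`) are invented for illustration and do not correspond to any particular switch OS. A real script would use whatever schema the vendor's API documents.

```python
import json
from urllib.request import urlopen


def fetch_counters(url, timeout=5):
    """GET a JSON document from the switch (hypothetical URL and schema)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize(counters):
    """Total received and transmitted bytes across all reported interfaces."""
    interfaces = counters.get("interfaces", [])
    return {
        "rx_bytes": sum(i.get("rx_bytes", 0) for i in interfaces),
        "tx_bytes": sum(i.get("tx_bytes", 0) for i in interfaces),
    }


if __name__ == "__main__":
    # Canned sample so the sketch runs without a switch; in production this
    # would be fetch_counters("http://switch.example/api/counters").
    sample = json.loads(
        '{"interfaces": ['
        '{"name": "swp1", "rx_bytes": 1200, "tx_bytes": 800},'
        '{"name": "swp2", "rx_bytes": 300, "tx_bytes": 700}]}'
    )
    print(summarize(sample))
```

    Keeping the parsing in `summarize` separate from the network call makes the accounting logic easy to test against saved responses.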

