Google's Hyper-Secure Data Centers

These articles reveal some details about Google's data centers:

What It’s Like At Google’s Hyper-Secure Data Centers

Super-Secret Google Builds Servers in the Dark

Google cools data center with bathtubs, dishwashers …

2 Comments

  1. Tomi Engdahl says:

    Google loses data as lightning strikes
    http://www.bbc.com/news/technology-33989384

    Google says data has been wiped from discs at one of its data centres in Belgium – after it was struck by lightning four times.

    Some people have permanently lost access to their files as a result.

    A number of disks damaged by the lightning strikes did, however, later become accessible.

    Generally, data centres require more lightning protection than most other buildings.

    While four successive strikes might sound highly unlikely, lightning does not need to repeatedly strike a building in exactly the same spot to cause additional damage.

    Justin Gale, project manager for the lightning protection service Orion, said lightning could strike power or telecommunications cables connected to a building at a distance and still cause disruptions.

    “The cabling alone can be struck anything up to a kilometre away, bring [the shock] back to the data centre and fuse everything that’s in it,” he said.

    In an online statement, Google said that data on just 0.000001% of disk space was permanently affected.

    “Although automatic auxiliary systems restored power quickly, and the storage systems are designed with battery backup, some recently written data was located on storage systems which were more susceptible to power failure from extended or repeated battery drain,” it said.

    The company added it would continue to upgrade hardware and improve its response procedures to make future losses less likely.

    Google Compute Engine Incident #15056
    https://status.cloud.google.com/incident/compute/15056#5719570367119360

    Google Compute Engine Persistent Disk issue in europe-west1-b

    From Thursday 13 August 2015 to Monday 17 August 2015, errors occurred on a small proportion of Google Compute Engine persistent disks in the europe-west1-b zone. The affected disks sporadically returned I/O errors to their attached GCE instances, and also typically returned errors for management operations such as snapshot creation. In a very small fraction of cases (less than 0.000001% of PD space in europe-west1-b), there was permanent data loss.

    ROOT CAUSE:

    At 09:19 PDT on Thursday 13 August 2015, four successive lightning strikes on the local utilities grid that powers our European datacenter caused a brief loss of power to storage systems which host disk capacity for GCE instances in the europe-west1-b zone. Although automatic auxiliary systems restored power quickly, and the storage systems are designed with battery backup, some recently written data was located on storage systems which were more susceptible to power failure from extended or repeated battery drain. In almost all cases the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state. However, in a very few cases, recent writes were unrecoverable, leading to permanent data loss on the Persistent Disk.

    This outage is wholly Google’s responsibility.

  2. Tomi Engdahl says:

    Google Loses Data: Who Says Lightning Never Strikes Twice?
    http://www.eetimes.com/document.asp?doc_id=1327474&

    Google experienced high read/write error rates and a small data loss at its Google Compute Engine data center in Ghislain, Belgium, Aug. 13-17 following a storm that delivered four lightning strikes on or near the data center.

    Data centers, like other commercial buildings, can be protected from lightning, and Google offered no details as to how its persistent-state disk equipment had been affected by the strikes, other than to say they caused power supply lapses. Emergency power kicked in as planned, but in some cases the battery backup to the disk systems did not perform as expected.

    According to a summary of the incident by the Google cloud operations team posted to its Google Cloud Status page: “Although automatic auxiliary systems restored power quickly, and the storage systems are designed with battery backup, some recently written data was located on storage systems which were more susceptible to power failure from extended or repeated battery drain.”

    The Cloud Status summary doesn’t say whether the repeated strikes led to multiple failures of the power supply to the disks.

    The summary also did not say the data center was struck four times, as a BBC report on the incident noted. Rather, Google said only that there were “four successive strikes on the electrical systems of a European data center.”

    Google Loses Data: Who Says Lightning Never Strikes Twice?
    http://www.informationweek.com/cloud/cloud-storage/google-loses-data-who-says-lightning-never-strikes-twice-/d/d-id/1321836

    In a four-strike incident, power to Google Compute Cloud disks in Ghislain, Belgium, gets interrupted and data writes are lost.

    In Google’s situation, its summary report said: “In almost all cases the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state. However, in a very few cases, recent writes were unrecoverable, leading to permanent data loss on the Persistent Disk.”

    Any loss of data is a serious incident for a cloud service provider, and they take extraordinary measures to prevent it. Data sets are routinely copied three times, so that a hardware failure will still leave two intact copies. But the power interruption in Ghislain caused some data writes to disk to be lost, and it was those write incidents that created the lost data.

    As a way of minimizing the loss, the Google summary cited a statistic that represented the amount of persistent disk space that had been affected out of the total available in Ghislain — “less than 0.000001%.” That was a meaningless figure to those customers who happened to be doing frequent read/writes with their systems at the time. A more meaningful figure would have been simply the total amount of data lost in kilobytes, megabytes, or terabytes or the percentage of writes lost.
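
    To make the point about the percentage concrete, here is a minimal sketch that converts "less than 0.000001%" into an absolute upper bound. Google did not publish the total persistent disk capacity of the zone, so the capacities below are purely hypothetical assumptions, and the function name `affected_bytes` is invented for this illustration.

```python
def affected_bytes(total_capacity_bytes: int, affected_percent: float) -> float:
    """Convert a percentage of disk space into an absolute byte count."""
    return total_capacity_bytes * affected_percent / 100.0


PETABYTE = 10**15  # decimal petabyte, in bytes

# Hypothetical zone capacities: the real figure was never disclosed.
for capacity_pb in (1, 10, 100):
    upper_bound = affected_bytes(capacity_pb * PETABYTE, 0.000001)
    print(f"{capacity_pb} PB of disk space -> under {upper_bound / 10**6:.0f} MB affected")
```

    Under these assumed capacities the same percentage spans two orders of magnitude in absolute terms, which is why the figure says little without the total.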

    Google loses data as lightning strikes
    http://www.bbc.com/news/technology-33989384
