
Cloud Tech is the largest gathering of cloud technologists and engineers in the Bay Area. Our speakers include top cloud computing entrepreneurs and experts.

Come join us Saturday, October 6th, from 9am to 6pm at the Computer History Museum in Mountain View, CA for a full 8 hours of learning directly from great minds sharing their secrets!

Special thanks to our sponsors who made this all possible. They are: CloudStack, Scalr, VMware, Rackspace, HP, DataStax, AWS, Canonical, Puppet, and General Catalyst.

Saturday, October 6 • 4:00pm - 4:45pm
A Survey of Petabyte Scale Storage Systems Deployed at Facebook

At Facebook, we use various types of databases and storage systems to satisfy the needs of different applications. The solutions built around these data stores share a common set of requirements: they have to be highly scalable, their maintenance costs should be low, and they have to perform efficiently. We use a sharded MySQL+memcache solution to support real-time access to tens of petabytes of data, and we use TAO to provide consistency of this web-scale database across geographical distances. We use the Haystack datastore to store the 3 billion new photos we host every week. We use Apache Hadoop to mine intelligence from 100 petabytes of clicklogs, and we combine it with the power of Apache HBase to store all Facebook Messages.

This talk describes why each of these databases is appropriate for its workload, and the design decisions and tradeoffs made while implementing these solutions. We touch upon the consistency, availability, and partition tolerance of each solution, and upon the reasons why some of these systems need ACID semantics while others do not. We also describe the evolution of our MySQL databases to a pure SSD-based deployment.


Speakers
Dhruba Borthakur

Software Engineer, Facebook
Dhruba Borthakur is an engineer in the Database Engineering Team at Facebook. He was one of the early committers for the Apache Hadoop Distributed File System and has been associated with Hadoop since its infancy, when he was working for Yahoo. He has been instrumental in scaling Facebook's Hadoop cluster to multiples of petabytes. Dhruba is a contributor to the open source Apache HBase project. His current focus is evaluating alternative database technologies that can provide optimal efficiency on Solid State Devices.

Earlier, he was a Senior Lead Engineer at Veritas Software (since acquired by Symantec), where he was responsible for the development of the Veritas SanPointDirect Storage System. Prior to Veritas, he was the Chief Architect at Oreceipt.com, an e-commerce startup based in Sunnyvale. Before that, he was a Senior Engineer at IBM-Transarc Labs, where he contributed to the development of the Andrew File System (AFS), a part of IBM's e-commerce initiative, WebSphere. Dhruba has an M.S. in Computer Science from the University of Wisconsin, Madison and a B.S. in Computer Science from BITS, Pilani, India. He has 20 issued patents. He is the co-author of multiple academic papers in respected conferences like SIGMOD, NSDI and EuroSys. He hosts a Hadoop blog...


Saturday October 6, 2012 4:00pm - 4:45pm
Hahn Auditorium Computer History Museum
