Cloud Tech III has ended

Cloud Tech is the largest gathering of cloud technologists and engineers in the Bay Area. Our speakers include top cloud computing entrepreneurs and experts.

Come join us Saturday, October 6th, from 9am to 6pm at the Computer History Museum in Mountain View, CA for a full 8 hours of learning directly from great minds sharing their secrets!


Special thanks to our sponsors who made this all possible. They are: CloudStack, Scalr, VMware, Rackspace, HP, DataStax, AWS, Canonical, Puppet, and General Catalyst.

Saturday, October 6 • 5:00pm - 5:45pm
Running the Largest HDFS Cluster


HDFS is a highly scalable, fault-tolerant distributed file system designed for running on low-cost commodity hardware. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable fast computation on large data sets.
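
As a rough illustration of the replication idea described above, the sketch below splits a file into fixed-size blocks and assigns each block to several distinct nodes. It is a toy model only: the function name `place_blocks`, the node names, the 128 MB block size, and the replication factor of 3 are assumptions chosen for the example, and the rotating placement shown here is not the rack-aware policy that HDFS itself uses.

```python
import math

def place_blocks(file_size, block_size, nodes, replication=3):
    """Toy placement: map each block index to `replication` distinct nodes.

    Hypothetical helper for illustration; real HDFS placement is rack-aware.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    num_blocks = math.ceil(file_size / block_size)
    placement = {}
    for b in range(num_blocks):
        # Rotate the starting node per block so replicas spread over the cluster.
        start = b % len(nodes)
        placement[b] = [nodes[(start + r) % len(nodes)] for r in range(replication)]
    return placement

MB = 1024 * 1024
# Assumed example values: a 300 MB file, 128 MB blocks, four nodes.
layout = place_blocks(300 * MB, 128 * MB, ["node1", "node2", "node3", "node4"])
for block, replicas in layout.items():
    print(block, replicas)
```

Because every block lives on several nodes, losing any single node in this model still leaves other copies of each block available, and computation can be scheduled on whichever node already holds a copy.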


Facebook uses HDFS as a data warehouse, storing Hive tables that collect user behavior data from Facebook's front pages. The warehouse HDFS cluster consists of thousands of nodes configured with over 100 PB of storage space. The cluster currently stores hundreds of millions of files and grows by hundreds of terabytes of physical space each day, while also serving a heavy load of I/O requests.


The Facebook warehouse HDFS cluster is by far the largest HDFS cluster in the world in terms of capacity. Keeping such a huge file system up and running quickly and reliably poses exciting and interesting challenges to the HDFS team. This talk will present more information on the scale of the cluster, the problems we face, and the solutions we have come up with to improve the availability, scale, and storage efficiency of the cluster.


Speakers

Hairong Kuang

Software Engineer, Hadoop committer, Facebook, Apache.org
Hairong Kuang currently leads the development of the Hadoop Distributed File System (HDFS) at Facebook. She has been a long-time contributor and committer to the Apache Hadoop project since she joined Yahoo!. Prior to industry, she was an Assistant Professor at California State Polytech...


Saturday October 6, 2012 5:00pm - 5:45pm
Hahn Auditorium Computer History Museum
