Google’s Computing Infrastructure
System
- ~3 million processors in clusters of ~2000 processors each
- Commodity parts
  - x86 processors, IDE disks, Ethernet communications
  - Gain reliability through redundancy & software management
- Partitioned workload
  - Data: Web pages, indices distributed across processors
  - Function: crawling, index generation, index search, document retrieval, Ad placement
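
A minimal sketch of how data might be partitioned across processors with redundant copies. The slide does not say how Google assigns pages to machines; hash-based placement with successor replicas is an assumption chosen for illustration, and the names `node_for`, `replicas_for`, and `partition` are hypothetical.

```python
import hashlib


def node_for(key: str, num_nodes: int) -> int:
    """Map a key (e.g. a page URL) to a node index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def replicas_for(key: str, num_nodes: int, copies: int = 3) -> list[int]:
    """Return the nodes holding a key: the hashed node plus its successors.

    Assumption: redundancy is modeled as `copies` consecutive nodes, so the
    loss of any single node still leaves other copies available.
    """
    if copies > num_nodes:
        raise ValueError("cannot place more copies than there are nodes")
    first = node_for(key, num_nodes)
    return [(first + i) % num_nodes for i in range(copies)]


def partition(urls: list[str], num_nodes: int, copies: int = 3) -> dict[int, list[str]]:
    """Build a node -> list-of-URLs placement table for a set of pages."""
    placement: dict[int, list[str]] = {n: [] for n in range(num_nodes)}
    for url in urls:
        for node in replicas_for(url, num_nodes, copies):
            placement[node].append(url)
    return placement
```

Calling `partition(urls, num_nodes=2000)` would spread a list of URLs over one cluster-sized group of nodes, with each page stored on three of them. A production system would also need to handle nodes joining and leaving, which this sketch ignores.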
A Data-Intensive Scalable Computer (DISC)
- Large-scale computer centered around data
  - Collecting, maintaining, indexing, computing
- Similar systems at Microsoft & Yahoo
Barroso, Dean, Hölzle, “Web Search for a Planet: The Google Cluster Architecture,” IEEE Micro, 2003
[Randal Bryant, CMU]