System
• ~3 million processors in clusters of ~2000 processors each
• Commodity parts
  - x86 processors, IDE disks, Ethernet communications
  - Gain reliability through redundancy & software management
• Partitioned workload
  - Data: Web pages, indices distributed across processors
  - Function: crawling, index generation, index search, document retrieval, ad placement
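
To make the partitioning and redundancy ideas above concrete, here is a minimal sketch. It assumes hash-based sharding and a fixed replica count; the slide states neither, so this is an illustration, not the actual scheme used in the system described. All names (`shard_for`, `replicas_for`, `pick_replica`) and constants are hypothetical.

```python
import hashlib
from typing import List, Set

NUM_SHARDS = 8   # hypothetical shard count, chosen only for illustration
REPLICAS = 3     # hypothetical replication factor


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replicas_for(shard: int, num_machines: int, replicas: int = REPLICAS) -> List[int]:
    """Place copies of a shard on consecutive machines, wrapping around."""
    return [(shard + i) % num_machines for i in range(replicas)]


def pick_replica(machines: List[int], down: Set[int]) -> int:
    """Return the first live machine holding the shard, skipping failed ones."""
    for m in machines:
        if m not in down:
            return m
    raise RuntimeError("all replicas of this shard are down")


# Rough cluster count implied by the slide's figures: ~3 million / ~2000.
approx_clusters = 3_000_000 // 2_000
```

With this placement, a query for a shard whose first machine has failed is served by the next replica, which is the sense in which software management over redundant commodity parts can provide reliability.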
A Data-Intensive Scalable Computer (DISC)
• Large-scale computer centered around data
  - Collecting, maintaining, indexing, computing
• Similar systems at Microsoft & Yahoo
Barroso, Dean, Hölzle, "Web Search for a Planet: The Google Cluster Architecture," IEEE Micro, 2003