[geeks] Fwd: [IP] Interesting speculation on the tech behind gmail
Brian Dunbar
brian.dunbar at plexus.com
Tue Apr 6 23:07:34 CDT 2004
This seemed 'Geeks List' worthy.
Begin forwarded message:
> From: Dave Farber <dave at farber.net>
> Date: April 6, 2004 3:31:35 AM CDT
> To: ip at v2.listbox.com
> Subject: [IP] Interesting speculation on the tech behind gmail
> Reply-To: dave at farber.net
>
>
> Delivered-To: dfarber+ at ux13.sp.cs.cmu.edu
> Date: Tue, 06 Apr 2004 13:56:09 +0530
> From: Suresh Ramasubramanian <suresh at hserus.net>
> Subject: Interesting speculation on the tech behind gmail
> To: Dave Farber <dave at farber.net>
>
> http://blog.topix.net/archives/000016.html
>
> April 04, 2004
> The Secret Source of Google's Power
> Much is being written about Gmail, Google's new free webmail system.
> There's something deeper to learn about Google from this product than
> the initial reaction to the product features, however. Ignore for a
> moment the observations about Google leapfrogging their competitors
> with more user value and a new feature or two. Or Google diversifying
> away from search into other applications; they've been doing that for
> a while. Or the privacy red herring.
>
> No, the story is about seemingly incremental features that are
> actually massively expensive for others to match, and the platform
> that Google is building which makes it cheaper and easier for them to
> develop and run web-scale applications than anyone else.
>
> I've written before about Google's snippet service, which required
> that they store the entire web in RAM. All so they could generate a
> slightly better page excerpt than other search engines.
>
> Google has taken the last 10 years of systems software research out of
> university labs, and built their own proprietary, production quality
> system. What is this platform that Google is building? It's a
> distributed computing platform that can manage web-scale datasets on
> 100,000-node server clusters. It includes a petabyte-scale, distributed,
> fault-tolerant filesystem, distributed RPC code, probably network
> shared memory and process migration. And a datacenter management
> system which lets a handful of ops engineers effectively run 100,000
> servers. Any of these projects could be the sole focus of a startup.
>
> Speculation: Gmail's Architecture and Economics
>
> Let's make some guesses about how one might build a Gmail.
>
> Hotmail has 60 million users. Gmail's design should be comparable, and
> should scale to 100 million users. It will only have to support a
> couple of million in the first year though.
>
> The most obvious challenge is the storage. You can't lose people's
> email, and you don't want to ever be down, so data has to be
> replicated. RAID is no good; when a disk fails, a human needs to
> replace the bad disk, or there is risk of data loss if more disks
> fail. One imagines the old ENIAC technician running up and down the
> aisles of Google's data center with a shopping cart full of spare disk
> drives instead of vacuum tubes. RAID also requires more expensive
> hardware -- at least the hot swap drive trays. And RAID doesn't handle
> high availability at the server level anyway.
>
> No. Google has 100,000 servers. [nytimes] If a server/disk dies, they
> leave it dead in the rack, to be reclaimed/replaced later. Hardware
> failures need to be instantly routed around by software.
>
> Google has built their own distributed, fault-tolerant, petabyte
> filesystem, the Google Filesystem. This is ideal for the job. Say GFS
> replicates user email in three places; if a disk or a server dies, GFS
> can automatically make a new copy from one of the remaining two.
> Compress the email for a 3:1 storage win, then store each user's email
> in three locations, and the raw storage needed is approximately
> equivalent to the user's mail size.
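>
> A quick sanity check on that claim, in Python; the 3:1 compression and
> 3x replication figures are just the guesses above, and the 1 GB mailbox
> is a placeholder:
>
>     # Napkin math for the replication-vs-compression claim.
>     mailbox_gb = 1.0         # raw mail per user (placeholder figure)
>     compression_ratio = 3.0  # assumed ~3:1 win from compressing stored mail
>     replicas = 3             # assumed GFS-style triple replication
>
>     raw_disk_per_user_gb = mailbox_gb / compression_ratio * replicas
>     print(raw_disk_per_user_gb)  # -> 1.0, i.e. roughly the user's mail size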
>
> The Gmail servers wouldn't be top-heavy with lots of disk. They need
> the CPU for indexing and page view serving anyway. No fancy RAID card
> or hot-swap trays, just 1-2 disks per 1U server.
>
> It's straightforward to spreadsheet out the economics of the service,
> taking into account average storage per user, cost of the servers, and
> monetization per user per year. Google apparently puts the operational
> cost of storage at $2 per gigabyte. My napkin math comes up with
> numbers in the same ballpark. I would assume the yearly monetized
> value of a webmail user to be in the $1-10 range.
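>
> A minimal version of that spreadsheet, using the figures above; the
> user count and average mailbox size are made-up placeholders, and the
> $2/GB figure is treated here as a yearly cost:
>
>     # Back-of-the-envelope webmail economics. All inputs are rough guesses.
>     users = 1_000_000                     # hypothetical user count
>     avg_storage_gb = 0.5                  # assumed average mail stored per user
>     storage_cost_per_gb_year = 2.0        # the ~$2/GB operational cost figure
>     revenue_per_user_year = (1.0, 10.0)   # the $1-10 monetization range
>
>     storage_cost = users * avg_storage_gb * storage_cost_per_gb_year
>     revenue_low = users * revenue_per_user_year[0]
>     revenue_high = users * revenue_per_user_year[1]
>     print(f"storage: ${storage_cost:,.0f}/yr")
>     print(f"revenue: ${revenue_low:,.0f} - ${revenue_high:,.0f}/yr")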
>
> Cheap Hardware
>
> Here's an anecdote to illustrate how different Google's cultural
> approach to hardware cost is from the norm, and what it means as a
> component of their competitive advantage.
>
> In a previous job I specified 40 moderately-priced servers to run a
> new internet search site we were developing. The ops team overrode me;
> they wanted 6 more expensive servers, since they said it would be
> easier to manage 6 machines than 40.
>
> What this does is raise the cost of a CPU second. We had engineers
> that could imagine algorithms that would give marginally better search
> results, but if the algorithm was 10 times slower than the current
> code, ops would have to add 10X the number of machines to the
> datacenter. If you've already got $20 million invested in a modest
> collection of Suns, going 10X to run some fancier code is not an
> option.
>
> Google has 100,000 servers.
>
> Any sane ops person would rather go with a fancy $5000 server than a
> bare $500 motherboard plus disks sitting exposed on a tray. But that's
> a 10X difference to the cost of a CPU cycle. And this frees up the
> algorithm designers to invent better stuff.
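>
> The arithmetic behind that 10X, assuming for simplicity that the cheap
> box and the fancy box deliver comparable compute (they don't match on
> reliability or density, which is exactly the tradeoff being accepted):
>
>     # Rough cost-per-cycle comparison; prices are the illustrative ones above.
>     fancy_server_cost = 5000.0
>     bare_motherboard_cost = 500.0
>     print(fancy_server_cost / bare_motherboard_cost)  # -> 10.0x per CPU cycle
>
>     # Same logic applied to the earlier anecdote: running a 10x slower
>     # algorithm on a $20M installed base implies ~$200M of hardware.
>     print(20_000_000 * 10)  # -> 200000000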
>
> Without cheap CPU cycles, the coders won't even consider algorithms
> that the Google guys are deploying. They're just too expensive to run.
>
> Google doesn't deploy bare motherboards on exposed trays anymore;
> they're on at least the fourth iteration of their cheap hardware
> platform. Google now has an institutional competence building and
> maintaining servers that cost a lot less than the servers everyone
> else is using. And they do it with fewer people.
>
> Think of the little internal factory they must have to deploy servers,
> and the level of automation needed to run that many boxes. Either
> network boot or a production line to pre-install disk images. Servers
> that self-configure on boot to determine their network config and load
> the latest rev of the software they'll be running. Normal datacenter
> ops practices don't scale to what Google has.
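>
> Purely as an illustration of that kind of boot-time self-configuration
> (the service names, paths, and URLs below are invented, not anything
> Google has described):
>
>     # Hypothetical sketch: a node discovers its identity, asks a central
>     # service what it should run, then fetches and starts that software.
>     import json
>     import socket
>     import subprocess
>     import urllib.request
>
>     CONFIG_SERVICE = "http://config.example.internal/node"  # made-up endpoint
>
>     def self_configure():
>         hostname = socket.gethostname()
>         # Ask the (hypothetical) config service for this box's assignment.
>         with urllib.request.urlopen(f"{CONFIG_SERVICE}/{hostname}") as resp:
>             assignment = json.load(resp)
>         # Pull the assigned software revision and start it.
>         subprocess.run(["rsync", "-a", assignment["release"], "/srv/app/"],
>                        check=True)
>         subprocess.run(["/srv/app/start", "--role", assignment["role"]],
>                        check=True)
>
>     if __name__ == "__main__":
>         self_configure()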
>
> What are all those OS Researchers doing at Google?
>
> Rob Pike has gone to Google. Yes, that Rob Pike -- the OS researcher,
> the member of the original Unix team from Bell Labs. This guy isn't
> just some labs hood ornament; he writes code, lots of it. Big chunks
> of whole new operating systems like Plan 9.
>
> Look at the depth of the research background of the Google employees
> in OS, networking, and distributed systems. Compiler optimization.
> Thread migration. Distributed shared memory.
>
> I'm a sucker for cool OS research. Browsing papers from Google
> employees about distributed systems, thread migration, network shared
> memory, and GFS makes me feel like a kid in Tomorrowland, wondering when
> we're going to Mars. Wouldn't it be great, as an engineer, to have
> production versions of all this great research?
>
> Google engineers do!
>
> Competitive Advantage
>
> Google is a company that has built a single very large, custom
> computer. It's running their own cluster operating system. They make
> their big computer even bigger and faster each month, while lowering
> the cost of CPU cycles. It's looking more like a general purpose
> platform than a cluster optimized for a single application.
>
> While competitors are targeting the individual applications Google has
> deployed, Google is building a massive, general purpose computing
> platform for web-scale programming.
>
> This computer is running the world's top search engine, a social
> networking service, a shopping price comparison engine, a new email
> service, and a local search/yellow pages engine. What will they do
> next with the world's biggest computer and most advanced operating
> system?
>
> Posted by skrenta at April 4, 2004 02:11 PM | TrackBack