Great article. One note: Google's dynamic abstracts were not only very useful, they also improved perceived relevance because they let users see why the pages were selected.
When I was at Altavista, we were also blocked from doing dynamic abstracts by cost.
Google's main advantages were:
- managed by the founders with a total focus on search and measurable results
- Google's hiring process produced a very strong team early on
- a strong focus on controlling costs from the beginning (Altavista's use of the DEC Alpha was a huge handicap)
Close. Lots of other companies were also hiring pretty high-tier talent, and had intense focus. Google's success came down to executing effectively across typically disparate disciplines. You have hardcore research-level CS eggheads, you have top-tier software engineers, and you have state-of-the-art data center operations. In a typical organization these groups have competing interests; they fight amongst themselves, and in the end some sort of compromise is reached that allows everyone to grudgingly get along.
At Google these three groups worked hand in hand and complemented each other's work. The eggheads came up with PageRank, the coders figured out how to make PageRank scale through massive parallelism via sharding and MapReduce, and the data center folks figured out how to make sharding cheap and fast through commodity PC-based servers and massive amounts of automation for management. In the end everyone was working at the top of their game to help everyone else. The result was that Google was able to deliver better results (PageRank) faster (MapReduce) and cheaper (automated commodity-hardware data centers) than the competition.
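To make the sharding-plus-MapReduce point concrete, here's a minimal sketch (in Python, with hypothetical names and a tiny made-up graph) of one PageRank iteration expressed as a map phase and a reduce phase over a sharded link graph. This is my own illustration under simplifying assumptions, not Google's actual code; a real deployment would run the map step for each shard on a separate machine:

    from collections import defaultdict

    DAMPING = 0.85

    def map_shard(shard, ranks):
        # Map phase: each page in the shard spreads its current rank
        # evenly across its outlinks.
        for page, outlinks in shard.items():
            if not outlinks:
                continue
            share = ranks[page] / len(outlinks)
            for target in outlinks:
                yield target, share

    def reduce_contributions(pairs, num_pages):
        # Reduce phase: sum the contributions arriving at each page,
        # then apply the damping factor from the PageRank formula.
        totals = defaultdict(float)
        for page, share in pairs:
            totals[page] += share
        return {page: (1 - DAMPING) / num_pages + DAMPING * total
                for page, total in totals.items()}

    # Tiny three-page link graph, split across two shards.
    shards = [
        {"a": ["b", "c"], "b": ["c"]},  # shard 0
        {"c": ["a"]},                   # shard 1
    ]
    ranks = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
    for _ in range(20):  # iterate until the ranks roughly converge
        pairs = [kv for shard in shards for kv in map_shard(shard, ranks)]
        new_ranks = reduce_contributions(pairs, num_pages=len(ranks))
        ranks = {p: new_ranks.get(p, (1 - DAMPING) / len(ranks)) for p in ranks}
    print(ranks)

The point is just that the map step is independent per shard, which is exactly what lets you throw lots of cheap commodity boxes at it.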
There were lots of other fine details that led to Google's success, but in the end those core factors are what allowed them to deliver a better search experience to users (better/faster) and to be more competitive in the marketplace (a lower cost per search means more profit even with lower per-search ad revenue).
No one else in search was pushing on all the right pressure points the way Google was, and the rest is history.
From the article: "In short, Google had realized that a search engine wasn't about finding ten links for you to click on. It was about satisfying a need for information. For us engineers who spent our day thinking about search, this was obvious. Unfortunately, we were unable to sell this to our executives. Doug built a clutter-free UI for internal use, but our execs didn't want to build a destination search engine to compete with our customers. I still have an email in which I outlined a proposal to build a snippets and caching cluster, which was nixed because of costs."
The engineers here had more than an inkling of what needed to be done. The problem was that this understanding never spread through the entire company.
I agree, but I'd ascribe that to Google being run by technical founders rather than MBAs. The main benefit of technical company leadership is the ability to "see" across and coordinate the disparate areas.
Infighting and begrudging compromises only happen when the leadership is blind to the details.
Having technical founders is a necessary but not sufficient condition, I think. Lots of companies have had technical founders who haven't achieved a level of success as impressive (regardless of scale) as Google's.
If black people can call themselves niggers and not be insulted, can geeks call themselves eggheads and not be insulted? Not that I condone black people calling themselves niggers, but if anyone can, they can. So why not geeks and the word egghead? Heck, even geek was a high school derogatory word.
When Altavista launched, it was an impressive showcase of the DEC Alpha's power. Intel only became usable for serious servers years later (with the exception of exotic stuff like Sequent), as did Linux. Google had the good fortune to be in the right place at the right time, when Lintel became a commodity in the datacentre. Five years earlier, they'd probably have been on Sun.
Altavista launched in 1995 and Google began as a research project in 1996. At my own startup, in 1996, we used Intel because with Sun servers you paid an extreme markup for unnecessary reliability.
I was VP of Engineering at Altavista in 2000, and I started the project to move to Linux. It wasn't easy because search engineering was populated by Alpha fans who were unswayed by the 10x cost advantage.
As late as 2001, I sat in multiple focus groups where all the enterprise customers said Linux was not yet ready for the datacenter. IBM's penguin campaigns were just beginning at that time.
Google's large-scale use of Linux was groundbreaking when they launched in 1998.
I'll bet you a dollar that Larry and Sergey never actually bought a Sun server to run the search engine, but rather used some free resource available to them at Stanford.
It would have affected their cost base certainly, and probably their entire datacentre strategy. With SPARC kit, you wouldn't build assuming that machines will often fail and simply be swapped out, for example, something that Google is famous for.
> With SPARC kit, you wouldn't build assuming that machines will often fail
In the late 90s SPARCs did fail. Yes, they were more reliable than commodity x86 boxes, but they failed often enough that it was an issue if you had 100 or so, and search engines hit that level very quickly.
Right, but look at what Google do, their boxes are basically disposable. Why invest in dual-redundant-hotswappable-everything boxes when you just throw the entire thing away if any bit of it breaks, 'cos it's cheaper to replace it than to even try to repair it in-place.
> Right, but look at what Google do, their boxes are basically disposable
We're talking about a matter of degree.
The claim was that building a search engine out of 90s SPARCs meant that you didn't have to worry about machines dying.
That claim is not true: reasonable search engines of that era required enough machines that the failure rate of 90s SPARCs, while better than that of contemporary x86, was enough to require folks to handle frequent failures.
It's reasonable to argue that the cost/benefit tradeoff of SPARC's extra reliability vs x86 wasn't worth it for those companies, but that's a different argument.
The better abstracts are the reason I use DuckDuckGo at home.
If I just want to know when the next episode of Big Bang Theory is out, or what the weather is today, I rarely need to even click on a result.
For more obscure technical searches at work, Google still finds more answers.
But remember: the barrier to change for a search engine's customers is very, very low.
With the new DuckDuckHack project as well, it does make very quick 'cheap' results a lot easier. For more complex queries I do seem to find myself !g'ing them. It's getting better though; it has improved over the last three months I've been using it.
I very often miss the top box too. I have learned how relevant this top box is, but I still overlook it very often.
My eyes usually go right to the link with the "Official site" tag.
I've heard the effect is called banner blindness.
However, I use adblock in every browser, so I have less training than others at ignoring ads. When I do see one, it hits me harder (a side effect of adblock).