
I think it's a case of "well-optimized code which does as little as possible can handle an awful lot of users on a single beefy box".



Until it can't. Then five years later you get Spanner.


You can buy some seriously big boxes, and easily split off a lot of services onto multiple boxes. The big problem with the "single big box" strategy is doing upgrades without downtime -- I see HN go down frequently for 5-10 min at a time in the middle of the night, which I assume is upgrades/reboots.

The happy medium is probably splitting database (master/slave at least) and cdn (if needed) and some other services (AAA? logging?) out, and then having 2+ front end servers with load balancing for availability.
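A minimal sketch of that split, assuming nginx as the load balancer and two hypothetical front-end app servers (the addresses and port are invented for illustration; the database, CDN, and logging boxes would sit elsewhere):

```nginx
# Round-robin across two front-end servers for availability;
# either one can be taken down for upgrades without an outage.
upstream frontends {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://frontends;
    }
}
```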


The Arc process hosting HN blows up at least once an hour (I wouldn't be surprised if there were a cronjob restarting it), and much more frequently during peak usage periods.

You wouldn't notice if it weren't for the use of closures for every form and all pagination: every time the process dies, all of them become invalid (except in the rare case that they lead to a random new place!).

There's no database; everything is in-memory, loaded on demand from flat files. That wouldn't be so bad except that it's all then addressed by memory locations rather than by content identifiers! There can be only one server per app, and to keep it real interesting PG hosts all the apps on the same box; during YC application periods he regularly limits HN to keep the other apps more available.


HN doesn't have a database capable of master/slave replication as such... so I think this will be harder if it ever becomes popular enough to need it. From what I know, though, I don't think it gets enough traffic that it's ever likely to exceed what you can fit in a single box.


Couldn't help but do a little digging...

In the first 99 comments of this page, the average comment text size is 231 bytes. Counting all comments on articles on the front page right now, there are 1678 of them, making somewhere around 388KB of comments for the past 12 hours.

So double that to ~776KB/day, round it to 1MB/day for safety's sake, and multiply by site age (5 years).

That gets us 1825MB. Projecting forward, it's difficult to imagine a time when a single recent SSD in a machine with even average RAM wouldn't be able to handle all of HN's traffic needs. Consider the recent beefy Micron P320h and its 785k IOPS: it could serve the entire comment history of Hacker News to the present day once every 2 seconds, assuming that history wasn't already occupying a teensy <2GB of RAM.

Even if Arc became a burden, a decent NAS box, gigabit Ethernet, and a few front-end servers would probably take the site well into the future. Assuming exponential growth, Hacker News comments would max out a 512GB SSD sometime around 2020, or 2021 if gzip bought a final doubling.
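The back-of-envelope math above can be sanity-checked in a few lines (the ~2GB starting point and annual doubling are assumptions chosen to match the figures quoted, not measured values):

```python
# Sanity check of the comment-storage estimates above.
avg_comment_bytes = 231        # measured over the first 99 comments
comments_per_12h = 1678        # comments on current front-page articles

half_day_kb = round(avg_comment_bytes * comments_per_12h / 1000)
print(half_day_kb)             # ~388 KB per 12 hours

total_mb = 1 * 365 * 5         # rounded to 1 MB/day over 5 years
print(total_mb)                # 1825 MB

# Exponential growth: start at ~2 GB today (2012) and double yearly
# until a 512 GB SSD fills up.
year, size_gb = 2012, 2
while size_gb < 512:
    size_gb *= 2
    year += 1
print(year)                    # 2020; gzip buying one more doubling -> 2021
```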


Clearly pg should release the dataset and institute an annual round of HN golf, where participants compete by recreating HN and trying to get the best performance for a given (changing) deployment target (SSD vs HDD, different RAM & CPU).


HN traffic just needs to grow more slowly than computing power, which seems reasonably likely.



