
Don't think so. They used to use Cloudflare but stopped. To my knowledge, it's a single server without a database (using the filesystem as a database).



So HN is serving 5.5M page views daily (excluding API access) on a single server without a CDN and without a database?

Holy crap, I am thinking either there is some magic or everything we are doing in the modern web is wrong.

Edit: The number is from Dang [1]

>These days around 5.5M page views daily and something like 5M unique readers a month, depending on how you try to count them.

[1] https://news.ycombinator.com/item?id=23808787


>Holy crap, I am thinking either there is some magic or everything we are doing in the modern web is wrong.

Spin up an apache installation and see how many requests you can serve per second if you're just serving static files off of an SSD. It's a lot.

edit: I see that there are already a bunch of other comments to this effect. I think your comment is really going to bring out the old timers, haha. From my perspective, the "modern web" is absolutely insane.


> From my perspective, the "modern web" is absolutely insane.

Agreed.

I was brought up as a computer systems engineer... So, not a scientist, but I always worked with the basic premise of keep it simple. I've worked on projects where we built all the new-fangled clustering and master/slave (sorry to the PC crowd, but that's what it was called) stuff but never once needed it in practice. Our stuff could easily handle saturated gigabit networks with the 2-core CPU only running at 40%. We had CPU to spare and could always add more network cards before we needed to split the server. It was less maintenance, for sure. It also had self-healing, so that some packets could be dropped if the client config allowed it and the server decided it wanted to (but it only ever did on the odd dodgy client connection).

That said, I was always impressed by Google's map-reduce for search results (yes, I know they've moved on), which showed how massive systems can be fast too. It seemed that the rest of the world wanted to become like Google, and the complexity grew for the standard software shop when it didn't need to, imho.

I jumped ship at that point and went embedded, which was a whole lot more fun for me.

Sincerely, old timer


[flagged]


How about we spend our energy fixing systemic/institutional racism first, because language will follow quite naturally.

The other way around surely doesn't work, and is just symbolic gestures without actual change.


There is only so much I can do. I'm not American, and that's a change I can't make on my own, aside from doing my best to be an ally when possible.

However, I can open a few PRs and use some of my time to make that change. It's a minor inconvenience to me, and if it makes even one black person feel heard and supported then yeah, I'm gonna do it.


>I think your comment is really going to bring out the old timers, haha.

That is great! :D

>It's a lot.

Well yes, but HN isn't really static though. It's fairly dynamic, with a huge number of users and comments. But still, I think I need to rethink lots of assumptions in terms of speed, scale and complexity.


Huge numbers of users don't really mean that much. Bandwidth is the main cost, but that's kept low by having a simple design.

Serving the same content several times in a row requires very few resources - remember, reads far outnumber writes, so even dynamic comment pages will be served many times in between changes. 5.5 million page views a day is only 64 views a second, which isn't that hard to serve.

As for the writes, as long as significant serialization is avoided, it is a non-issue.

(The vast majority of websites could easily be designed to be as efficient.)
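
A minimal sketch of the read-heavy point above, in Python. It assumes a hypothetical `render_thread` function and a plain in-memory dict as the cache; none of it is taken from the actual HN code. A rendered page is reused until a write invalidates it, so the expensive render runs roughly once per write rather than once per view.

```python
# Hypothetical read-through page cache: render once per write, not once per view.
class PageCache:
    def __init__(self, render):
        self._render = render      # function: thread_id -> HTML string
        self._pages = {}           # thread_id -> cached HTML
        self.renders = 0           # how many times we actually rendered

    def get(self, thread_id):
        """Return cached HTML, rendering only on a cache miss."""
        if thread_id not in self._pages:
            self._pages[thread_id] = self._render(thread_id)
            self.renders += 1
        return self._pages[thread_id]

    def invalidate(self, thread_id):
        """Call after a write (new comment, edit) to drop the stale page."""
        self._pages.pop(thread_id, None)


comments = {1: ["first"]}          # stand-in for the real storage (assumption)

def render_thread(thread_id):
    items = "".join(f"<li>{c}</li>" for c in comments[thread_id])
    return f"<ul>{items}</ul>"

cache = PageCache(render_thread)
for _ in range(1000):              # 1000 views between writes
    cache.get(1)
comments[1].append("second")       # one write...
cache.invalidate(1)                # ...drops the cached page
page = cache.get(1)                # the next view re-renders it
print(cache.renders)               # 2 renders for 1001 views
```

Invalidation here is a single dictionary operation, which is one way of keeping the write path free of heavy serialization.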


There is some caching somewhere as well, which probably provides a bit more of a boost.

I've been at my work laptop (not logged in) and found something I wanted to reply to, so I pulled out my phone and did so. For a good 10 seconds afterwards, I could refresh my phone and see my comment, but refresh the laptop and not see it.


> From my perspective, the "modern web" is absolutely insane.

You know, it should be even better than it was in the past, because a lot of heavy lifting is now done on the client. If we properly optimized our stuff, we could potentially request tiny pieces of information from servers, as opposed to rendering the whole thing.

Kinda like native apps can do (if the backend protocols are not too bloated).


> Holy crap, I am thinking either there is some magic or everything we are doing in the modern web is wrong.

It doesn't need to be crazy.

A static site on DigitalOcean's $5 / month plan using nginx will happily serve that type of traffic.

The author of https://gorails.com hosts his entire Rails video platform app on a $20 / month DO server. The average CPU load is 2% and half the memory on the server is free.

The idea that you need some globally redundant Kubernetes cluster with auto fail-over capabilities seems to be popular but in practice it's totally not necessary in so many cases. This outage is also an unfortunate reminder that you can have the fanciest infrastructure set up ever and you're still going down due to DNS.


> The idea that you need some globally redundant Kubernetes cluster with auto fail-over capabilities seems to be popular but in practice it's totally not necessary in so many cases

True, but this is why it shouldn't be bashed either. When you need it, you need it (cue very complex enterprise applications with SLA requirements).


> True, but this is why it shouldn't be bashed either. When you need it, you need it (cue very complex enterprise applications with SLA requirements).

To support this, look at how many people criticize Kubernetes as being too focused on what huge companies need instead of what their small company needs. Kubernetes still has its place, but some people's expectations may be misplaced.

For a side project, or anything low traffic with low reliability requirements, a simple VPS or shared hosting suffices. WordPress and PHP are still massively popular despite React and Node.js existing. Someone who runs a site off of shared hosting with WordPress can have a very different vision about what their business/side project/etc. will accomplish compared to someone who writes a custom application with a "modern" stack.


Modern web is a completely broken mess.

We were serving around that traffic off a single dual pentium 3 in 2002 quite happily off IIS/SQL Server/ASP. The amount of information presented has not grown either.

That little box had some top tier brand main corporate web sites on it too and was pumping out 30-40 requests a second peak. There was no CDN.


You were not serving that traffic, you were just serving your core functionality - no tracking, no analytics, no ads, no a/b, no dark mode, no social login, no responsiveness. Are most of those shitty? Sure, just let me know when you figure out how to pry a penny from your users for all your hard work.


Oh no, not the dark mode! The sacrifices we have to make for performance I guess...


Easy. We built something that was worth money without all that.

Not a one trick marketoid pony.


Dark mode and responsive webdesign are both good for the user and efficient for the server and user's device.


That means an average of about 63 pages per second. Let's say that the total number of queries is tenfold and take a worst case scenario and round up to 1000 queries per second and then multiply by ten to get 10k queries per second, because why not.

I don't know what the server's specs are but I'm sure it must be quite beefy and have quite a few cores, so let's say that it runs about 10 billion instructions per second. That means a budget of about one million instructions per page load in this pessimistic estimate.

The original PlayStation's CPU ran at 33MHz and most games ran at 30fps, so about 1 million cycles per fully rendered frame. The CPU was also MIPS and had 4KiB of cache, so it did a lot less with a single cycle than a modern server would. Meanwhile the HN server has the same instruction budget to generate some HTML (most of which can be cached) and send it to the client.

A middle of the line modern desktop CPU can nowadays emulate the entire PlayStation console on a single core in real time, CPU, GPU and everything else, without even breaking a sweat.

>Holy crap, I am thinking either there is some magic or everything we are doing in the modern web is wrong.

Magic, clearly.
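
The estimate above can be checked in a few lines of Python. This is only the comment's own back-of-the-envelope arithmetic; the 10 billion instructions per second and the 10k queries per second are the guesses stated above, not measurements of the real server.

```python
# Figures are the rough assumptions from the comment, not measured values.
pessimistic_qps = 10_000                  # rounded-up worst-case query rate
server_instr_per_sec = 10_000_000_000     # assumed: 10 billion instructions/s

instr_per_query = server_instr_per_sec // pessimistic_qps

# Original PlayStation: 33 MHz CPU, most games at 30 frames per second.
ps1_cycles_per_frame = 33_000_000 / 30

print(f"{instr_per_query:,} instructions per query")
print(f"{ps1_cycles_per_frame:,.0f} PlayStation cycles per frame")
```

Both numbers land around one million, which is the comparison the comment is making.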


That's only ~60 QPS. Assume it is peaky and hits something more like 1000 QPS in peak minutes, but also assume most of the hits are the front page, which contains so little information it would fit literally in the registers of a modern x86-64 CPU.

Even a heavyweight and badly written web server can hit 100 QPS per core, and cores are a dime a dozen these days, and storage devices that can hit a million ops per second don't cost anything anymore, either.


In-memory databases? That's amateurish. Time to make a service that runs out of ymm registers.


Not sure where your 5.5M number came from, but that's only 64 requests per second.

90 to 99% of those are logged-out users, so fully cacheable.

Only a handful of dynamic requests each second remain.


Unlike Reddit, logged in and logged out users largely see the same thing. I wouldn't imagine there is much logic involved in serving personalized pages, when they don't care who you are.


The username is embedded in the page, so you can't do full caching unfortunately. But the whole content could be easily cached and concatenated to header/footer.
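
A rough Python sketch of that idea, assuming hypothetical `render_header` and `render_body` functions (these names and the markup are illustrative, not HN's actual code): the user-independent body is cached once and shared, and only the small header is rendered per request.

```python
from functools import lru_cache
from typing import Optional

@lru_cache(maxsize=1024)
def render_body(item_id: int) -> str:
    """Expensive, user-independent part; cached and shared by all readers."""
    return f"<main>comments for item {item_id}</main>"

def render_header(username: Optional[str]) -> str:
    """Cheap, per-user part; rendered on every request."""
    who = username if username is not None else "login"
    return f"<header>Hacker News | {who}</header>"

FOOTER = "<footer>Guidelines | FAQ</footer>"

def render_page(item_id: int, username: Optional[str] = None) -> str:
    # Concatenate the per-user header with the shared cached body.
    return render_header(username) + render_body(item_id) + FOOTER

a = render_page(42, "alice")
b = render_page(42, "bob")
c = render_page(42)                       # logged-out reader
print(render_body.cache_info())           # the body was rendered only once
```

This only works while the body is identical for every reader; any per-user filtering of the body would need a different split.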


You also can't do that every time because of hidden, flagged, showdead, etc.


That could be split as a setting in the user-specific header with the visibility part handled client-side. A bit more work, but it's not impossible if it's worth it.


Oh! I have never looked at HN frames, etc., but I assumed the header was separate. Thank you!


It's amazing how little CPU power it takes to run a website when you're not trying to run every analytics engine in the world and a page load only asks for 5 files (3 of which will likely already be cached by the client) that total less than 1 MB.


It is certainly doable. PoF famously served a lot of page views off a single IIS server for a long time.

HN is written in a Lisp variant and most of the stack is built in-house; it is not difficult to imagine efficiency improvements when many abstraction layers have been removed from your stack.


I don't remember PoF being famous for that, but they got a lot of bang for the buck on their serving costs.

What I do remember, is that it was a social data collection experiment for a couple of social scientists, that never originally expected that many people would actually find ways to find each other and hook up using it.

I miss their old findings reports about how weird humans are and what they lie about. Now, it's just plain boring with no insights released to the public.


POF was also run by a single dude until he sold it for some 600 million dollars!


There are a couple of old posts about it in the hpc blog.

Nick Craver from SO also once mentioned that they could run SO off a single server; while it wasn't fun, it was doable and had happened at some point.


For all my sibling comments, there is also context to be aware of. 5.5M page views daily can come in many shapes and sizes. Yes, modern web dev is a mess, but the situation is very different from site to site. This should be taken as a nice anecdote, not as a benchmark.


You can serve a lot of flat files from a properly configured server in 2020. It's just that most people don't bother trying.


You don’t need a CDN if what you’re serving up in this case is mostly all text.

Just need good stable code and server side caching.


Back in 2000 a joke project of mine got slashdotted. Ran outta bandwidth before anything else.


With DO, these days, they don't run me out of bandwidth, but my instance falls over (depending on what I am doing - which ain't much), but with AWS, they auto-scale and I get a $5000 bill at the end of the month. I prefer the former.


Yeah, the bandwidth overage was a grand, give or take. It was a valuable lesson in a number of ways, and why for any personal things I wouldn't touch AWS with a shitty stick.


AWS only auto scales if you configure it to...


Yeah, but I'm stupid and didn't understand all the switches.


S3 will scale as much as is needed by itself


A lot of modern web technology is inefficient for the sake of being ergonomic. Here's what Hacker News looks like: https://github.com/arclanguage/anarki/blob/master/apps/news/...


From an old-school lisper perspective, the code seems perfectly ergonomic to me.

It's ergonomic in a very lispy way but perfectly reasonably so from the POV of that aesthetic.


Well, it's not magic. So, the other one.


I had this same reaction. Definitely feels like most of what we’re doing with “the modern” web is probably wrong.


That's why we have these bloated, over-engineered multi-node applications: people just underestimate how fast modern computing is. To serve ~2^6 requests/sec is trivial.

It's easily served by a simple server.


1M queries per day is ~10 queries per second. It's a useful conversion rate to keep in mind when you see anyone brag about millions of requests per day.
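
As a quick check of that conversion, here is a small Python helper (the function name is made up for illustration). It gives the average rate only, assuming traffic is spread evenly over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

def per_day_to_per_second(requests_per_day: float) -> float:
    """Average request rate, assuming traffic is spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY

for label, per_day in [("1M/day", 1_000_000), ("5.5M/day", 5_500_000)]:
    print(f"{label}: {per_day_to_per_second(per_day):.1f} requests/s")
```

One million a day works out to a bit under 12 per second, so "~10" is a fair rule of thumb; peak minutes will of course run higher than the daily average.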


that number is not very big

I used to host a WordPress site that had 5M pageviews a month on a $10 (and later $20) DigitalOcean instance.

That's WordPress on a shared VPS. I imagine it could be a lot higher with a dedicated server and self-written software.


You have to remember there's a lot of seconds in a day, that's only 60 qps.


HN could probably be served by Python running on a fancy laptop.


Everything we're doing is wrong.


Flat-file DBs and mountable DB file systems are the future.


Man, if this is true, these guys have steel balls.


We've been saying this for a while now...


I don't get this comment, what does page serving performance have to do with "the modern web"? It's not as if serving up a js payload would make it more difficult to host millions of requests on a single machine, html and js are both just text.


Makes sense based on what I've read about Arc, which HN is written in.

I've been working on something where the DB is also part of the application layer. The performance you can get on one machine is insane, since you spend minimal time on marshalling structures and moving things around.


"They used to use Cloudflare but stopped."

They are still using Cloudflare. Unlike CF, M5 does not require SNI.

   curl --resolve news.ycombinator.com:443:104.20.43.44 https://news.ycombinator.com



