Hacker News

> You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients.

It was about physical servers.
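The quoted numbers are just total machine resources divided across 20,000 clients. A quick sketch of that arithmetic (variable names are mine, for illustration):

```python
# Back-of-the-envelope check of the quoted C10K arithmetic:
# divide the whole machine by 20,000 clients.
clients = 20_000
cpu_hz = 1_000_000_000        # 1000 MHz CPU
ram_bytes = 2 * 1024**3       # 2 GB of RAM
net_bps = 1_000_000_000       # 1000 Mbit/sec Ethernet

print(cpu_hz / clients)       # 50,000 Hz = 50 kHz per client
print(ram_bytes / clients)    # ~107,374 bytes, roughly 100 KB per client
print(net_bps / clients)      # 50,000 bits/sec = 50 kbit/sec per client
```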



This is definitely talking about scaling past 10K open connections on a single server daemon (hence the reference to a web server and an ftp server).

However, most people used dedicated machines when this was written, so scaling 10K open connections on a daemon was essentially the same thing as 10K open connections on a single machine.


> You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so

Those are not per-process capabilities, and daemons were never restricted to a single process.

The article focuses on threads because processes had more kernel-level problems than threads did. But it was never about per-process limitations.

And by the way, process capabilities on Linux are exactly the same as machine capabilities. There's no limitation. You are insisting everybody uses a category that doesn't even exist.


Of course daemons weren't limited to a single process, but the old one-process-per-connection forking model wasn't remotely scalable, not only because of kernel deficiencies (which certainly didn't help) but also because of the extreme cost of context switches on the commodity servers of the day.
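For illustration, the fork-per-connection model being described looks roughly like this (a minimal sketch in Python rather than the C of the era; function names are mine):

```python
# Minimal sketch of the classic fork-per-connection daemon model.
# Every accepted connection gets its own process, so at thousands
# of clients the kernel must context-switch among thousands of
# processes -- the cost discussed above.
import os
import socket

def forking_echo_server(host="127.0.0.1", port=0):
    """Create a listening socket (port 0 = pick any free port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(128)
    return srv

def serve_forever(srv):
    while True:
        conn, _addr = srv.accept()
        if os.fork() == 0:           # child: handle exactly one connection
            srv.close()
            data = conn.recv(4096)
            conn.sendall(data)       # trivial echo "work"
            conn.close()
            os._exit(0)
        conn.close()                 # parent: just keep accepting
```

Note that the child never needs select() or poll() at all: it owns one fd and can simply block on it, which is why the multiplexing machinery below is irrelevant to this model.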

Now perhaps my memory is a bit fuzzy after all these years, but I'm pretty sure that when I was asking about scaling above 15,000 simultaneous connections back in 1999 (I think the discussion on linux-kernel is referenced in this article), it was for a server listening on a single port that required communication between users, and the only feasible way to do that at the time was multiplexing the connections in a single process.

Without that restriction, hitting 10,000 connections on a single Linux machine was much easier: run multiple daemons, each listening on its own port and just using select(). It still wasn't great, but it wasn't spending 40% of its time in poll() either.
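A minimal sketch of that select()-based multiplexing, in Python's stdlib `select` module rather than the C API (the function is hypothetical, for illustration):

```python
# Multiplex many connections in one process with select():
# one syscall reports which sockets are ready, so a single
# process can serve every client without blocking on any one.
import select
import socket

def multiplex(server_sock):
    """Echo loop: accept new clients and echo whatever they send."""
    clients = []
    while True:
        readable, _, _ = select.select([server_sock] + clients, [], [])
        for sock in readable:
            if sock is server_sock:
                conn, _addr = sock.accept()
                clients.append(conn)
            else:
                data = sock.recv(4096)
                if data:
                    sock.sendall(data)
                else:                      # EOF: client went away
                    clients.remove(sock)
                    sock.close()
```

select() carries its own limit (FD_SETSIZE, typically 1024 fds), which is part of why running several such daemons on separate ports was the workaround.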

Most of the things the article covers (multiplexing, non-blocking IO, and event handling) were strategies for handling more connections within a single process. The various multiplexing methods were discussed because syscalls like poll() scaled extremely poorly as the number of fds increased. None of that is particularly relevant for one-connection-per-process forking daemons, where in many cases you don't need polling at all.
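The poll() pattern in question can be sketched the same way (Python's `select.poll` wraps the poll(2) syscall; the function name is mine). The scaling problem is visible in the interface itself: every call hands the kernel the entire registered fd set to scan, so per-call cost grows linearly with connection count even when only one connection is active.

```python
# poll()-based event loop. Unlike select(), poll() has no
# FD_SETSIZE cap, but the kernel still scans every registered
# fd on each call -- O(n) per wakeup, the cost that ate 40% of
# the time at tens of thousands of connections.
import select
import socket

def poll_echo_loop(server_sock):
    poller = select.poll()
    poller.register(server_sock, select.POLLIN)
    conns = {}                                # fd -> socket
    while True:
        for fd, _event in poller.poll():      # blocks until something is ready
            if fd == server_sock.fileno():
                conn, _addr = server_sock.accept()
                conns[conn.fileno()] = conn
                poller.register(conn, select.POLLIN)
            else:
                conn = conns[fd]
                data = conn.recv(4096)
                if data:
                    conn.sendall(data)
                else:                         # EOF: drop the client
                    poller.unregister(fd)
                    del conns[fd]
                    conn.close()
```

Later mechanisms like epoll and kqueue fixed this by registering fds once with the kernel, so each wakeup returns only the ready fds instead of rescanning the whole set.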



