Zed is probably right that poll is faster than epoll when all descriptors are ac...

zedshaw · on Aug 4, 2010

There's nothing in anything I've said that disagrees with your assertion that regular web server (not internet servers) have lots of idle connections. What I've been saying all along is something very simple.

If the ATR > 0.6, poll wins. If ATR < 0.6, epoll wins.

That means, there's potential gains to be had by using both, and at a minimum you can use both and it won't hurt you very much.

Additionally, Mongrel2 isn't a regular server. Remember all those pesky 0MQ backends? Those basically crank data fast as shit, so this research could potentially benefit them. It also points out potential flaws in previous analyses, so people can now potentially use ATR to tune their server's work loads appropriately.

But notice I keep saying potentially? So far I'm the only one offering everything up, all data, everything for anyone to prove me wrong, and I actually don't think I'm totally right.

I don't see very many other people who criticize the idea doing the same. Just more comments on HN.

jfager · on Aug 4, 2010

What I've been saying all along is something very simple.

If the ATR > 0.6, poll wins. If ATR < 0.6, epoll wins.

You left out the part where you accused people who advocate using epoll of being bullshitting touts who perpetuate fallacies:

"why the touted benefits of epoll and most of the information out there is mostly bullshit because of some falacies people seem to have about epoll (mostly perpetuated by epoll's proponents)."

Which you proceeded to back up with a strawman argument about claims of O(1) performance and a rant about how graphs that show that epoll accomplishes exactly what epoll is designed to accomplish "flat out get it wrong".

You're using "ATR" to find the inflection point in the performance curve from epoll's constant overhead per active fd, for the purpose of micro-optimizing Mongrel2. That's fine. But when you claim that people who just use epoll because it solves the problem they actually need solved are making decisions based on "bullshit", you shouldn't act so surprised that they don't respond positively.

zedshaw · on Aug 4, 2010

The touted benefits of epoll are mostly bullshit, since it was advocated as being O(1) and frequently mentioned as being O(1) and always faster than poll by others. In fact, right after mentioning that I thought epoll wasn't faster the first thing people said was it was O(1). Every person I talked to said it.

I have evidence that contradicts both of the main assertions of epoll, so until someone comes up with counter evidence I can safely say that epoll information is bullshit, and my argument is nothing like a straw man.

So yes, it is bullshit, and if that offends you then awesome. Because nothing's worse than blindly believing some load of crap when you could have a better understanding of what you're doing.

jfager · on Aug 4, 2010

I googled 'epoll "O(1)"'. You're the first hit (congratulations), followed by some random person on stackoverflow, the documentation for a CPAN module, the repetition of that documentation on github and in various package repos, and some random bloggers and mailing list responders who are either just referring to the work that you have to do to find an active fd when the wait call returns, or are confused.

The man page says nothing about O(1):

http://www.kernel.org/doc/man-pages/online/pages/man4/epoll....

The original epoll writeup says nothing about O(1), and shows graphs explicitly for dead connection scenarios (iow, it never claims better performance per active fd):

http://www.xmailserver.org/linux-patches/nio-improve.html

The Dan Kegel writeup that's pretty much universally cited on this issue says nothing about O(1):

http://www.kegel.com/c10k.html#nb.epoll

The link you threw up yesterday that "flat out gets it wrong" says nothing about O(1) and simply shows epoll doing exactly what it was designed to do (again, it never claims better performance per active fd), including graphs that actually do show poll outperforming epoll with small numbers of dead connections:

http://lse.sourceforge.net/epoll/index.html

Can you show any authoritative sources who claim epoll is O(1)? If not, then yes, you built a fluffy little strawman and took him down hard.

I have no problem with your work on superpoll. I'm skeptical it will help much, guessing there will be situations it will hurt, and overall not convinced it's worth the effort to find out, but it's your time. Have fun with it, and I'm eager to see the results.

I just don't get why you feel the need to imply that everyone who ever used epoll without finding the ATR performance cut-off is a blind moron snowed into doing so by some shadowy malevolent epoll cabal spreading misinformation about how it works.

pphaneuf · on Aug 5, 2010

I've never perceived epoll as being O(1), always O(N), the big thing being the N being how many events are dispatched, rather than the number of watched fds.

I'm fine with the amount of time taken scaling linearly with the amount of work to do, I'm not cool with it scaling linearly with the number of flowers in the garden. :-)

rbanffy · on Aug 4, 2010

> there's potential gains to be had by using both

Only if your ATR crosses the .6 boundary frequently. If it consistently remains on either side, the performance would be worse at the cost of added complexity.

I agree with your assumption we need more instrumentation. I strongly suspect my ATRs are consistently closer to zero, but I never looked under the hood to check.

Also, the ATR should vary according to what you are serving. Longer streams would drive the ATR up while serving shorter bursts would drive it down.

zedshaw · on Aug 4, 2010

Exactly, nobody knows because they've been operating under the assumption that epoll is always faster so they never looked.

Now I've got some evicence that epoll isn't faster, and in fact it's a wash at 40% "utilization", which is sort of ridiculous. I'd want people to go and test and see what they get, or at least know the implications so they pick wisely.

Ultimately though, I'd rather have the Linux kernel just fix epoll so it's always faster than poll.

rbanffy · on Aug 4, 2010

> Ultimately though, I'd rather have the Linux kernel just fix epoll so it's always faster than poll.

Unless they break poll, that's very unlikely. The implementation of poll is much simpler and should be faster per fd. What could happen is that epoll gets some improvement that moves up the .6 boundary but I seriously doubt epoll could ever be faster than poll with 100% active fds - and I never encountered that kind of usage.

As with any optimization, we should be concerned with end result: how much slower does using epoll really makes the system. I am on the skeptical side here.

pphaneuf · on Aug 5, 2010

Hey, some idea I remembered (from some old paper): don't go from epoll to poll when the ATR gets too high, go to... NOTHING!

Just read() or write(), as if something told you they were ready! Your code has to be able to handle EAGAIN anyway, right? If you're at 1.0 ATR, you're saving the whole cost of poll or epoll!

I'm actually serious here, BTW. It would need some measurement to figure out the threshold at which this is okay, and if it's close enough to 0.6, I'm guessing you could go straight from epoll to nothing, and take the small (potential) performance hit between 0.6 and 0.X.

Hmm, I'm an idiot: this is basically epoll in edge-triggered mode.

derefr · on Aug 4, 2010

I believe that, other than in the new-for-the-sake-of-new sysadminning camp, people almost never start by deploying Mongrel. They deploy Apache, because it's easy (every distro comes with an excellent default set of LAMP packages, for obvious reasons.) Thus, Mongrel (or any other non-mindshare-majority server) is only sought out by those who don't like the one they've got—that is, by people who are not getting the performance they desire.

Mongrel is probably not used by Joe the VPS user, serving 20 simultaneous connections per hour, because Joe is doing just fine with the Apache install that came with his slice. Mongrel is deployed for high-traffic file servers, as the current weapon of choice.

It's very likely that that will continue to be its niche (though it will expand out somewhat as larger shared-hosting providers start to default to it), so it's very likely to mostly see use in high-ATR deployment scenarios. Under that assumption, Zed's experiment is solid. I would still like to see a prologue tacked on about the composition of the Mongrel user base, though, if Zed has it :)

Edit: rephrased to... sound less like Zed?

yummyfajitas · on Aug 4, 2010

Thus, Mongrel (or any other non-mindshare-majority server) is only sought out by those who don't like the one they've got - that is, by people who are not getting the performance they desire.

This is probably true for most non-apache servers now, but I don't think it will necessarily be true for Mongrel2.

Mongrel2 gives us a new development model - http calls -> messages, language doesn't matter. I can easily mix python and c++/Haskell, for instance [1]. It's rather different from what we've had before.

Mongrel isn't just a faster apache; people might use Mongrel2 for features rather than performance.

[1] Last time I tried to do this, I basically put a halfassed version of mongrel2 behind apache/django. I.e., apache/django handled http and sent messages to my c++ daemons. Fugly.

zedshaw · on Aug 4, 2010

Exactly, this is why I was curious about the platitude that epoll wins all the time.