Accidentally nonblocking

jacquesm · on June 6, 2016

I don't know how many times I looked at the output of c-preprocessors and compilers to figure out what the heck was going on. One choice example of this was a pretty complex system that managed to call a top level routine from somewhere deep down in the stack if an error occurred (which promptly led to a stack overflow that would only very rarely trigger).

The 'nonblocking' here is just a symptom of a much larger problem: abstraction is a great way to build complex stuff out of simple parts but it is also a great way to introduce all kinds of effects that you weren't aiming for in the first place and this particular one is easier than most to catch. You can find the same kind of problems at all levels of software systems, all the way up to the top where dosomethingcomplex() and if it fails dosomethingcomplex() again is the cure.

Writing easy to understand code is a big key to solving this kind of problem, I've always tried (but probably never succeeded) in writing code in the simplest way possible, as soon as I find myself reaching for something clever I feel it is a mistake. Either circumstance (some idiot requirement, such as having to use the wrong tools for the job) or need may be used occasionally to transgress the rule but if you do it with any regularity at all (and without documenting the particular exception and the motivation to go outside of the advised lines) you are almost certainly going to regret it. (Or your successor may one day decide to pay you a house-call with a blunt object...)

catnaroek · on June 6, 2016

> abstraction is a great way to build complex stuff out of simple parts but it is also a great way to introduce all kinds of effects that you weren't aiming for in the first place and this particular one is easier than most to catch.

This isn't a problem when abstractions don't leak. Polishing abstractions until they don't leak is super hard, though.

ArkyBeagle · on June 6, 2016

Ignoring all that - in order to abstract something, you have to either 1) make assumptions or 2) establish a method for the configuration of those assumptions.

We all (naively) want it just to be handled for us. But sometimes that doesn't work out. We are the ones who have to learn that; the Second Law of Thermodynamics ( which is the lynchpin of the Two Generals Problem ) is unlikely to change to accommodate our foolishness :)

As I understand you, "polishing abstractions until they don't leak" is equivalent to "doing the whole job, not just part of it." Economically, this is a pain point for the people we work for. It sounds expensive. The accounting for it is very difficult. "Can't you just make it work" is not unreasonable.

Narrow is the way.

catnaroek · on June 6, 2016

> in order to abstract something, you have to either 1) make assumptions or 2) establish a method for the configuration of those assumptions.

Most importantly, you need to enforce the assumptions. The lack of enforcement is where abstraction leaks come from.

> the Second Law of Thermodynamics ( which is the lynchpin of the Two Generals Problem ) is unlikely to change to accommodate our foolishness :)

The second law of thermodynamics is fundamental to understanding how the physical world works, but software is a purely logical artifact.

> As I understand you, "polishing abstractions until they don't leak" is equivalent to "doing the whole job, not just part of it."

It means redesigning the abstraction until there are no cases uncovered by the abstraction's enforcement mechanisms.

> Economically, this is a pain point for the people we work for. It sounds expensive.

Make no mistake, it is expensive. But dealing with abstraction leaks is even more expensive.

> The accounting for it is very difficult. "Can't you just make it work" is not unreasonable.

It doesn't work if it breaks.

ArkyBeagle · on June 6, 2016

Enforcement is the entire point. A failed return from a recv() may be an application problem. It doesn't compress.

> ..software is a purely logical artifact

No. No, sir , it is not. There is no magical unicorn version of communications in which you can simply assume it all always gets there instantly and in order. We can get close - severely underutilized Ethernet & 802.11 spoil us - but nuh uh.

> Make no mistake, it is expensive.

And you wonder why they are like they are :) "you can't afford it, honey." :)

> It doesn't work if it breaks.

Indeed.

catnaroek · on June 6, 2016

> No. No, sir , it is not. There is no magical unicorn version of communications in which you can simply assume it all always gets there instantly and in order. We can get close - severely underutilized Ethernet & 802.11 spoil us - but nuh uh.

That simply means you want an unimplementable abstraction. (Perfectly reliable sequential communication over a computer network.) Of course it doesn't make sense to want impossible things.

> And you wonder why they are like they are :) "you can't afford it, honey." :)

This brokenness can't be fixed at the level of business applications. Languages and standard libraries need to be fixed first.

ArkyBeagle · on June 6, 2016

I forget what the thing you just did is called, but you've managed to switch sides. :) I'm the one who said there is no unicorn version etc. ....

You can't fix that in a library. There is a sequence of escalation. Failures are formally checked-for and counters are incremented, alarms are sent, actions are taken...

You may not be interested in the Second Law, but the Second Law is interested in you.

catnaroek · on June 6, 2016

> I forget what the thing you just did is called, but you've managed to switch sides. :)

I didn't switch sides. I stand by my assertion that software is a purely logical artifact. The laws of thermodynamics have no bearing on whether redirecting the control flow to a far-away exception handler (or, even worse, undefined behavior) is a reasonable way to deal with unforeseen circumstances.

> I'm the one who said there is no unicorn version etc. ....

I'm not talking about unicorns, only about abstractions that don't leak. That being said, I'll admit that sometimes there are good reasons for using leaky abstractions. My favorite examples of this is garbage collection. The abstraction is “you can always allocate memory and you don't need to bother deallocating it”. The second part is tight, because precise collectors guarantee objects will be reclaimed a bounded number of cycles after they become unused. But the first part is leaky, because the case “you've exhausted all memory” is uncovered. The reason why this isn't a problem in practice is that most programs don't come anywhere near exhausting all available memory, and, if it ever happens, ultimately the only possible fix is to add more RAM to the computer.

FWIW, I don't consider TCP a leaky abstraction, because it doesn't promise that actual communication will take place. It only promises that, if messages are received, they will be received in order by the user of the abstraction. That being said, most TCP implementations are leaky, as is pretty much anything written in C.

ArkyBeagle · on June 6, 2016

Quoth Spolsky: "All nontrivial abstractions are leaky."

This means you still have to deal with it.

catnaroek · on June 6, 2016

> Quoth Spolsky: "All nontrivial abstractions are leaky."

Well, no, that's wrong. Abstractions can be made tight, but that requires discipline and hard work.

wahern · on June 6, 2016

Lest somebody get the wrong idea from his post, note that he's not arguing to use poll on sockets that aren't non-blocking (i.e. without the O_NONBLOCK flag on the open file table entry).

When a socket polls for readiness in Unix, it does not mean that a subsequent read will succeed. The obvious case is when another thread reads from the socket before you do. A less obvious case is that some kernels, such as Linux, implement lazy checksum verification. Linux will wake up any waiting threads when a packet comes in (including marking an open file table entry as readable), but the checksum isn't verified until an actual read is attempted. If the checksum fails, the packet is silently discarded. If the socket wasn't in non-blocking mode, your application will stall until the next packet is received.

The JRE had (and maybe still has) a bug like this, where it assumed poll meant that a subsequent read was guaranteed to succeed or fail immediately.

This particular issue is less common today with checksum hardware offloading, but the correctness and robustness of your software probably shouldn't depend on particular network chipsets.

Another bug I've seen several times is assuming that a write to a UDP socket won't block. You can usually get away with this on Linux because the default buffers are so huge. As with the above issue, it really only shows when your application (and thus the network) is under significant load.

One conclusion I draw from this is that while people go to great lengths to implement a supposedly scalable architecture, most of the time developers never see the kinds of heavy load that such architectures are designed for. If they had, they would have discovered these sorts of issues. Fortunately or unfortunately for me, I discovered both of the above issues the hard way.

[1] If you're wondering why I kept writing "open file table entry" instead of descriptor, it's because they're not the same thing. And some day I expected a few CVEs to be issued related to overlooking such distinctions. For example, on BSDs /dev/fd/N duplicate a descriptor point to the same file table entry, just as dup(2) does. On Linux /dev/fd is a symlink to /proc/self/fd. /proc/self/fd creates a new file table entry. In the former case, software setting or unsetting O_NONBLOCK effects all other references to that entry.

comex · on June 7, 2016

> When a socket polls for readiness in Unix, it does not mean that a subsequent read will succeed.

Yikes, I didn't know Linux did that. That sounds like a serious spec violation to me. POSIX says:

> POLLIN

> Data other than high-priority data may be read without blocking.

http://pubs.opengroup.org/onlinepubs/009695399/functions/pol...

It's hard to interpret that other than as a promise not to block. Oh, and the Linux poll(2) man page doesn't even mention the caveat. The select man page does (I assume the actual behavior applies to poll too), but here POSIX is even more explicit:

> A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully. (The function might return data, an end-of-file indication, or an error other than one indicating that it is blocked, and in each of these cases the descriptor shall be considered ready for reading.)

ArkyBeagle · on June 6, 2016

There is more than one checksum. At layer 2, the checksum is its own thing. At layer three, a partial read means the checksum isn't necessarily here yet - assuming the checksum is relevant ( UDP makes checksums optional ).

IMO, you really need to make writes to a UDP socket explicitly nonblocking and check the error codes.

geofft · on June 6, 2016

Cory Benfield's PyCon talk last week, "Building Protocol Libraries the Right Way" (https://www.youtube.com/watch?v=7cC3_jGwl_U), makes the argument that a large number of problems can be traced to not cleanly separating responsibilities of actually physically doing I/O and making semantic sense of the bytes. His primary worry was about reimplementing things like HTTP many times, once for each I/O framework (why do Twisted, Tornado, and asyncio all have their own HTTP implementation?). But it seems the same problem can be seen here: every single part of the code thinks it knows how to actually retrieve data from the network, so it interacts with the network on its own, causing nested polling and similar awkwardness. If every part of the event-processing code thinks it knows how to do network I/O, you have many more opportunities for getting network I/O wrong.

If xterm were designed so that e.g. xevents() had only the responsibility of fetching bytes from the X socket and do_xevents() and everything else had only the responsibility of handling bytes from an buffer, there would be no temptation to poll in two different functions. Only one function would even know that the byte source is a socket; the rest just know about the buffer.

jerf · on June 6, 2016

One of the nice things about Go is that the io.Reader and io.Writer interface being written into the base libraries means a lot of code gets this right, and only expects a stream rather than "a socket".

The takeaway here is not that Go is awesome; the takeaway is a lesson on the importance of getting a very early release of a language and its stdlib correct. The vast majority of modern languages today could trivially-to-easily do the same thing, but they don't in the standard lib, so the first couple of libraries end up string based, so the next libraries that build on those end up based on strings, and before you know it, in practice hardly anything in the ecosystem is implemented this way, even though in theory nothing stops it from happening. (Then around year 3 or 4, a big library gets built that does this correctly, but it's too late to retrofit the standard library and it only ever gets to about 10% penetration after a lot of reimplementation work.)

yetihehe · on June 6, 2016

The more I see such problems the more I like erlang. Most socket handling libraries split handling into protocol handling layer and application layer. Protocol layer ensures there is full message available and application layer handles only full messages. Most of the time it's the simplest and most natural way to do anything in erlang.

jacquesm · on June 6, 2016

Erlang really gets this right. Abstract out all the generic server stuff and have it coded up by experts, then have the application programmers concentrate on the application. A bit like programming a plug-in for Apache but then extrapolated to just about anything you could do with a server. Erlang is a very interesting eco-system, the more I play around with it the more I like it and the way it is put together. If it had a shallower learning curve it would put a lot of other eco-systems out of business. But then again, the fact that it doesn't makes it something of a secret weapon for those shops and individuals that have managed to really master it.

raarts · on June 6, 2016

> if it had a shallower learning curve

Elixir is meant to address that.

JoshTriplett · on June 6, 2016

> If xterm were designed so that e.g. xevents() had only the responsibility of fetching bytes from the X socket and do_xevents() and everything else had only the responsibility of handling bytes from an buffer, there would be no temptation to poll in two different functions.

X is an interesting special case. The X protocol has some special cases where you have to make sure you read before you write, or vice versa; doing the wrong buffering or blocking operation can result in a deadlock between you and the server.

I certainly enjoyed that PyCon talk, and I agree with the conclusion; however, there are some special-case protocols like X where integrating them into your main loop requires some special protocol-specific care.

geofft · on June 6, 2016

I think you can solve this by reporting an "I can't read unless you write some more" event, or allowing a "I can't write unless you process some events" return code from the write function. You need some protocol awareness (you can't completely abstract every protocol as bytes -> JSON and JSON -> bytes), but it doesn't rise to the level of letting application code directly have access to the underlying file descriptor.

I believe both SSL and SSH have similar issues, where the state of the protocol client requires that you order reads and writes in some way to avoid deadlock. I guess TCP also has the a similar risk with window sizes going to 0, and in practice, hiding TCP behind a UNIX file descriptor and a relatively constrained socket API works fine; client apps don't need to care about the exact state of the TCP implementation.

nicolast · on June 6, 2016

https://blog.incubaid.com/2011/12/13/hybrid-sync-async-pytho...

osivertsson · on June 6, 2016

Imagine for a moment how programs would be different if all polls had timeouts and all sockets were blocking. For a little while, there’d be some unpleasant stalls. But these would not be insurmountable problems. With a little concentration, it’s possible to rearchitect the program with a much more robust design that neither loses events nor requires speculative guesses.

Yes please, I'd like that. The code that ends up on my desk would be easier to understand and refactor.

Instead of a proper fix, the developer changes the socket to nonblocking

Sometimes yes, and sometimes the developer decide to spawn a thread, and now you have lots of problems...

curried_haskell · on June 6, 2016

n.da nyouow hlotsof probave. . lems

ArkyBeagle · on June 6, 2016

I've seen EAGAIN as well l as EBADF errors as a "normal" part of operation against TCP sockets. I say "normal" because I've only seen EBADF once and it was because the client side started talking too early. IOW, when select()/poll() tell you socket 13 is ready and recv() tell you to EBADF, then the socket is just not ready to go just yet. Go around again.

The client side grew up against serial ports ( yes, those are still a thing ) , where you don't have this problem. The owner of the client side was more or less in incredulous terror when I broached this subject. Sigh. So I just ignored them. Big sigh. If it failed, there were retries so the only cost was a little delay now and again.

You cannot fragment UDP unless you're prepared to add some method of sequencing as part of the application protocol. Each UDP PDU needs to be fully atomic otherwise.

For cases analogous to SNMP row creation ( in which multiple varbinds[1] determine the outcome ), there is the "as if simultaneous" rule as a heuristic - all PDUs related to creating a row must be cached and only applied when the row state is set. And sometimes you can configure things to send all varbinds in one PDU.

[1] a varbind is a triple of the set/get/next/multi operator, the object id and if applicable a value as encoded by the SNMP Basic Encoding Rules.

So your little serialization protocol? It suffers all the heartache of a full-on transactional database processing system.

These things are this way because communications are like that.

mark242 · on June 6, 2016

This is where I think Scala really, really shines, regardless of whether you're using Akka or not. Once you've gotten into that mode of using futures, turning your code from blocking to nonblocking is as simple as never writing that Await statement, but having your methods return a Future[MyClass] instead of MyClass.

The funny thing is that writing nonblocking code doesn't have to be as hard as it is, you just have to get into the mindset. It's easy to say "well, I have to have the result coming back from my JDBC/REST/etc call before I know what to do with it" and that's not the case at all, especially when you're working in a strongly-typed environment.

janvidar · on June 6, 2016

The problem here is that most libraries need to integrate into some sort of main loop somehow, and unfortunately there are lots of different ways of doing the main loop of the application. Some libraries integrate other things which are not directly poll()-able, but expose the same interface while doing so.

Now you have the problem when trying to combine multiple such libraries. For example, you can try using GTK+ and QT in an application at the same time.

One thing that has always bugged me is that there is a lack of standardized (cross platform) non-blocking DNS lookup functionality. This adds to the main-loop complexity, since you have to poll() certain types of resources, and have to deal with threads or subprocessed in order to look for DNS results.

Well written frameworks like Qt abstract away this complexity, but that may not always play nice when mixing libraries.

ArkyBeagle · on June 6, 2016

But the complexity of handling error code returns from UDP/TCP stacks is fundamentally irreducible. IMO, and it's just that - I'd rather deal tactically with fully nonblocking sockets than gamble on the writer of a library's wrapping of it. If it turns out the library works then bonus - but an error code at the socket layer may ripple all the way up to the ... UX layer. The socket handler is the central artifact.

If you can, try a socket thing in Tcl. It (SFAIK ) completely asbtracts all the ugly away. Stuff built properly ( see the Brent Welch book for "properly" ) in Tcl will be - again, SFAIK, after ... hundreds of these things ) fully reference grade. And they're very cheap to build.

I strongly recommend at least being able to build socket handlers in Tcl because eventually, you'll get into a "he said/she said" over a comms link and using Tcl to test your side is extremely convenient. I had a boss who liked Wireshark and I told him - I don't need Wireshark; I have this test driver. I said this because say you have 2GB of Wireshark spoor. Now what? Just print it out and use a highlighter?

nathanappere · on June 6, 2016

"So the net result was that this optimization really resulted in an extraneous recvfrom call per request, which returned EAGAIN. What was I thinking?"

That only happens if somehow you have received data that is EXACTLY the size of the buffer you send to recvfrom. If you have read less, you know that there is no need to call it again. If your buffer is full, then odds are there is more data to read so the next call won't be useless.

I suspect you actually had an extremely small proportion of "extraneous" calls.

spc476 · on June 6, 2016

For UDP packets, requesting a read of less than a full packet will discard the rest of the packet. This does not happen for TCP packets though.

Darn those leaky abstractions.

Matt3o12_ · on June 6, 2016

I was lost after the second code example (I'm not a c programmer but I'm quite comfortable in C-like languages (Go, Python, Java).

    void
    xevents(void)
    {
        if (poll() || poll())
             while (poll()) {
             /* ... */
             }
    }

What advantages does this code produce and how is it related to the first example? Why would calling poll 3 times have any advantages if either call 1 or 2 must be true and call 3, and the following ones must be true as well?

darklajid · on June 6, 2016

That's a humorous translation of the first code. He claims (I'm no expert either) that xtermAppPending() is poll(), GetBytesAvailable() is poll() and then xevents is invoked, which calls xtermAppPending() again in a loop.

The code you're confused about is basically combining the two previous methods (hence: 'manual inlining' with a wink) and replacing both xtermAppPending() and GetBytesAvailable() with poll() to make the problem stand out.

adamnemecek · on June 6, 2016

The article is about how this repeated polling is wrong but the fact that it's happening is not transparent from just looking at the code.

svantana · on June 6, 2016

That was the whole point, that convoluted source code can hide inefficiencies like this.

Bino · on June 6, 2016

insightful, however most people should probably look into libraries when doing non-blocking io, which should remove these kinds of caveats

ksherlock · on June 6, 2016

Using a library doesn't remove the caveat of layers of abstractions. Quite the opposite, in fact.

EGreg · on June 6, 2016

Async structures aren't always availablein every environment. Take JS for instance.

Without ES6 and Babel, what's the way to write if statements with async?

  if (x) {
    _afterX(null, x);
  } else {
    getX(_afterX);
  }
  function _afterX(err, x) {
    if (err) return;
    // use x
  }
  // do stuff that doesn't depend on x