>> If you really think about it, the only real difference between main memory blocking and disk blocking is the amount of time they may block.
>
> This is a somewhat confusing analysis you have here. Direct read/write from memory for all intents and purposes doesn't block. Why do you say that reads and writes may also block?
Reads and writes from actual, physical, hardware memory might block, depending on how you define "block", in the sense that some reads may miss CPU cache. But once you get to that point, you could argue that every branch might block if the branch misprediction causes a pipeline stall. This is not a useful definition of "block".
The thing is, most programs are almost never low-level enough to be dealing with memory in that sense: they read and write virtual memory. And virtual memory can block for any number of reasons, including some pretty non-obvious ones. For example:
- the system is under memory pressure and that page is no longer in RAM because it got written to a swap file
- the system is under memory pressure and that page is no longer in RAM because it was a read-only mapping from a file and could be purged
  - e.g. it's part of your executable's code
- this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now
- you're in a VM and the VMM can do whatever it wants
- the page is COW from another process
I think what I'm saying is that calling file I/O "blocking" is also not a useful definition of "block". Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".
> this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now
And said allocation could block on all sorts of things you might not expect. Once upon a time I helped debug a problem where memory allocation would block waiting for the XFS filesystem driver to flush dirty inodes to disk. Our system generated lots of dirty inodes, and we were seeing programs randomly hang on allocation for minutes at a time.
> I think what I'm saying is that calling file I/O "blocking" is also not a useful definition of "block". Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".
In addition to the point made elsewhere: you're implicitly downplaying the magnitude of the difference here. The latency gap is on the order of 1000x.
Another way of drawing the line is whether the OS (or, more generally, some kind of software trap handler) has to get involved. A main memory read to a non-faulting address doesn't involve the OS, i.e. it never blocks. However, faulting reads, "disk" I/O, and networking I/O (i.e. I/O in general), which do involve the OS/monitor/what have you, are all potentially blocking operations.
It does not matter whether the OS is involved. Consider a spinlock; if it is spinning, waiting on the lock to be released, then it is blocking.
What matters is whether control returns to the process before the operation is complete. If the process waits, it is blocking (aka synchronous); if the process does not wait, it is non-blocking (and possibly also asynchronous if it checks later to see if the operation succeeded).
> Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".
The difference, conservatively, is a factor of 1000.
There are plenty of times in software engineering where scaling 1000x will force you to reconsider your architecture.
To be clear I do not believe that async disk I/O is never useful, I just think that it's not as useful as people at first imagine when they learn about async I/O.
Yes, it may be 1000 times slower than memory. But there's a fundamental paradigm difference from network events, in that with network events you are waiting for some other entity to take action, with no implicit expectation that they will do so in any particular timeframe. Like, if you're waiting for connections on a listen socket, there's no telling how long you will be waiting.
Disk I/O is fundamentally different in that once you submit an operation, you expect it to complete within a reasonable, finite time period.
Async disk I/O is primarily useful for implementing read-ahead / write-behind scheduling behaviors. While databases tend to be the obvious use case, the OS is often so poor at this that there are large performance improvements even for much simpler use cases that are otherwise disk I/O intensive.
I'm not sure that's the primary use case any more. Fast SSDs require high queue depths to reach their full throughput, so async I/O is desirable any time an application knows it has several I/O requests it can issue in parallel; one thread per request has too much overhead.
Sure, but that behavior is effectively read-ahead / write-behind on your I/O buffers. "Read-ahead / write-behind" doesn't mean much more than anticipating future I/O operations and issuing them before the code needs their completion to make efficient forward progress.
They're really not equivalent. Read-ahead only helps for predictable IO patterns. Issuing multiple read requests in parallel from the application is useful in a far broader range of scenarios. And for both reads and writes, being able to submit IO in batches (without having to wait for the entire batch to complete) can drastically cut down on overhead compared to submitting IOs sequentially as if they were a linear dependency chain, and makes it possible to keep the storage properly busy instead of it idly waiting on the host software to prepare and submit the next IO.
All cache replacement algorithms are literally equivalent to universal sequence prediction problems, per the optimality theorem. There is no implication of sequential decisions here. When you schedule a batch of disk I/O, you are essentially front-running the sequence predictor to avoid classes of prediction failure where successful prediction would be computationally intractable (and therefore not implemented in real systems), which is expected to produce better I/O throughput on average if done competently per the same theory. There is nothing magic about this, it is in the literature, and databases in particular have explicitly exploited non-sequential scheduling to circumvent fundamental sequence prediction limits for decades. Optimally anticipating future requirements for reads and writes can be called whatever you like, but that remains the primary use case for async I/O since you can't do it with blocking I/O in a single thread.
This becomes more important as caches become larger because cache efficiency increases are strongly sublinear as a function of size, as expected. Servers are already at the scale where very deep async I/O scheduling is required for consistent throughput with high storage density, beyond what can be done via traditional buffered disk I/O architectures, async or not. It is an active area of research with some interesting ideas.