I'm not buying the rationale in the "async IO" section.
First, you don't need to rewrite anything to put an async interface on sqlite (many clients do exactly that, whether local or remote).
The issue with sqlite's synchronous interface is leaving a thread idle while you wait for IO. But I wonder how much of an issue that really is. sqlite is designed to run very close to the storage, and can make use of native file caching, etc., which makes IO blocking very short, if not zero. You have to wonder whether applications have enough idling sqlite threads to justify the switching. (It's not free and would happen at quite a fine-grained level.)
The section does mention remote storage, but in that case you're much better off with an async client talking to compute running sqlite, sync interface and all, that is very local to the storage. AKA, a client/server database.
Also, in the WASM section, we're still talking about something that would best be implemented as a sqlite client/wrapper, with no need at all to rewrite it.
> The issue with sqlite's synchronous interface is leaving a thread idle while you wait for IO
That's not the only issue. Waiting for the result of every read before you can queue the next one is also an issue, particularly for a VFS that lives on a network (which is a target of theirs; they explicitly mention S3).
I'm not sure whether they're also working on this, but in theory many of the reads and writes SQLite issues don't depend on all the previous ones, which means you could queue many of them earlier. If your latency to storage is large, this can make a huge performance difference.
You can get more total IO throughput (at the cost of latency) by queueing up multiple reads and writes concurrently. You can do this with threads, but io_uring should theoretically go faster (but don't take my word for it, let's wait for benchmarks).
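To make the throughput point concrete, here is a rough sketch using plain threads. Everything in it is an assumption for illustration: `read_page` is a hypothetical stand-in for one blocking read against high-latency storage (it just sleeps), and `LATENCY_S` and `PAGE_SIZE` are made-up numbers. It is not how sqlite or any VFS actually issues reads.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.02   # hypothetical round-trip time to remote storage
PAGE_SIZE = 4096   # hypothetical page size in bytes


def read_page(page_no: int) -> bytes:
    """Stand-in for one blocking read; the sleep simulates storage latency."""
    time.sleep(LATENCY_S)
    return page_no.to_bytes(4, "big") * (PAGE_SIZE // 4)


def read_sequential(pages):
    """Issue each read only after the previous one has returned."""
    return [read_page(p) for p in pages]


def read_concurrent(pages, workers=8):
    """Queue independent reads up front so their latencies overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with `pages`.
        return list(pool.map(read_page, pages))
```

With N independent reads, the sequential version pays roughly N times the latency, while the concurrent one pays roughly N / workers times the latency. That only helps when the reads really are independent, which is the open question above.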
I'm personally interested in the potential for async bindings for Python. Making fast async wrappers for blocking APIs in Python-land is painful (although it might improve in the future with nogil).
They had been talking about making the high-level interface to sqlite async (sqlite3_step()).
With io_uring you're talking about the low-level, where blocks are actually read and written.
As-is, sqlite is agnostic on that point. It doesn't do I/O directly, but goes through an OS abstraction layer called the VFS. VFS implementations for common platforms are built in, but you can create your own that handles storage IO any way you like, including queuing reads and writes concurrently using io_uring.
So that's not a reason to rewrite sqlite.
(In fact, I'd be surprised if they weren't looking at io_uring and, if it seemed likely to improve performance generally, at providing an option to use it, either in the existing linux-vfs or in some other way.)
> I'm personally interested in the potential for async bindings for Python.
Well, it's perfectly possible to do that with the current sqlite. It may be painful, as you say, but not even remotely at the level of pain a complete rewrite entails.
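A minimal sketch of such a wrapper, assuming Python 3.9+ and only the standard library. `query` and `main` are hypothetical names for illustration, not part of any existing binding; the blocking sqlite3 calls are simply pushed onto a worker thread with `asyncio.to_thread` so the event loop stays free.

```python
import asyncio
import sqlite3


async def query(db_path: str, sql: str, params=()):
    """Run one blocking sqlite query on a worker thread and await the rows."""

    def run():
        # The connection is opened and closed on the worker thread, so it
        # never crosses threads.
        conn = sqlite3.connect(db_path)
        try:
            return conn.execute(sql, params).fetchall()
        finally:
            conn.close()

    return await asyncio.to_thread(run)


async def main():
    # Both queries are in flight at once from the event loop's point of view.
    return await asyncio.gather(
        query(":memory:", "SELECT 1 + 1"),
        query(":memory:", "SELECT 'a' || 'b'"),
    )
```

Calling `asyncio.run(main())` returns one list of rows per query. Opening a connection per call keeps the sketch short; a real wrapper would more likely keep a connection per worker thread.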
The VFS interface is synchronous; I don't see how a custom VFS could meaningfully implement asynchronous IO.
> Well, it's perfectly possible to do that with the current sqlite.
If you want to wrap a blocking API in python, with actual parallelism, you have to use multiple processes with communication between them. The main advantage of sqlite in the first place is that it's in-process, and you'd lose that.
On a single thread. There can be multiple threads.
Of course, leaving a thread idle while waiting for IO isn't great. That's why I noted it at the beginning. But idle threads don't seem to have proven to be much of a problem with sqlite, so they wouldn't be much justification for a rewrite.
> If you want to wrap a blocking API in python, with actual parallelism, you have to use multiple processes
You can use multiple threads in the same process.
(Python has some limitations in that respect, but that's not a sqlite issue and can't be fixed by a sqlite rewrite.)
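A rough sketch of the same-process, multi-thread approach, assuming one connection per thread and a database file on disk. `make_db`, `sum_range`, and `parallel_sum` are hypothetical helpers for illustration. As far as I know, CPython's sqlite3 module releases the GIL while it is inside sqlite calls, so the threads can overlap on IO, but treat that as an assumption and measure before relying on it.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_db(path):
    """Create a small table holding the integers 0..99."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE t (n INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
    conn.close()


def sum_range(path, lo, hi):
    """Blocking query on its own connection, opened in the calling thread."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT SUM(n) FROM t WHERE n >= ? AND n < ?", (lo, hi)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


def parallel_sum(path):
    """Split the work across threads in one process and combine the results."""
    chunks = [(0, 25), (25, 50), (50, 75), (75, 100)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        return sum(pool.map(lambda c: sum_range(path, *c), chunks))
```

Everything stays in-process: no IPC, just several connections to the same file.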