It just buffers inserts in a queue and then does them all at once after a second has passed. This document does briefly mention it but it's such a nice feature that saves you from setting up a whole Kafka thing.
You want to be careful of async inserts as it's in-memory buffered as well and is not replicated afaik. So you can lose a bit of durability when using it (or you make sure your upstream pipeline can retry).
My read on it was that if you keep wait_for_async_insert=True then the client will still wait until there's confirmation of a successful transaction, giving the possibility to retry. It does require one to actually have a retry handler on the client side, but that's a good idea anyways.
In practice, using async insert with wait means you need batching/retries implemented on the client side or else you may find you have thousands of threads stuck in wait. If you need to replay bc you messed something up you're also SOL. I generally appreciate not having to run kafka but that comes at a cost.
Note that you will need a thread pool that’s large enough to handle 1 second (or whatever flush timeout you set) of events in your app concurrently. If you have thousands of events per second, plan accordingly.
Didn’t spend enough time to find the root cause, but at enough volume this option caused some of our inserts to take several seconds, even with awaiting for success disabled. Just moved to using an in-memory queue in our app and it fixed it.
The Kafka engine gets a little bit of hate (well, I haven't worked with ClickHouse for like 2 years now) but honestly it will go a long way.
At PostHog we were inserting something around 50M rows a day into CH with it and it was quite nice to be able to pause ingestion to a table by just detaching the table via SQL :D
I think they're still using the Kafka engine today but not sure. In our case (at least back then) we had to live with suboptimal batch sizes because we were providing near realtime analytics so Kafka was a solid fit.
It just buffers inserts in a queue and then does them all at once after a second has passed. This document does briefly mention it but it's such a nice feature that saves you from setting up a whole Kafka thing.