Very cool, and very much the trend for very-high-performance data systems: disaggregated storage. I'm reminded of Datadog's write-up on Husky, their newest/3rd-gen data engine: https://www.datadoghq.com/blog/engineering/introducing-husky...
> By writing data directly to S3-compatible object storage, Bufstream can dispense with expensive disks and nearly eliminate inter-zone networking, slashing costs dramatically.
I am curious what the situation looks like for self-hosters. If you own the servers, it's less clear how much advantage there is to using an S3 object store versus disk-attached storage. But reciprocally, getting good at Ceph and the Ceph Object Gateway (its S3-compatible layer) and then being able to apply and tune that storage knowledge at the platform level makes sense, versus running a separate storage service for x, y, and z.
Still, I think there is huge potential for something like Pulsar, with its BookKeeper tablets of data, to rise. Our object stores don't seem to be excellent at data locality or replication, and being able to tap into those properties could yield incredible systems efficiencies and speeds; object storage has abstracted them away from us and has to brute-force its way around them.
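To the Ceph point: the nice part of an S3-compatible gateway is that the same client code works against either backend. A minimal sketch with boto3, where the RGW endpoint, credentials, and bucket are all placeholders:

```python
# Point a standard S3 client at a self-hosted Ceph Object Gateway instead of AWS.
# Endpoint, credentials, and bucket name here are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.internal:7480",  # Ceph RGW's S3-compatible endpoint
    aws_access_key_id="CEPH_ACCESS_KEY",
    aws_secret_access_key="CEPH_SECRET_KEY",
)

# Same API calls whether the backend is AWS S3 or Ceph RGW.
s3.put_object(Bucket="bufstream-data", Key="segments/000001.log", Body=b"batch of records")
print(s3.get_object(Bucket="bufstream-data", Key="segments/000001.log")["Body"].read())
```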
I'm curious what architectural flourishes are behind your latency tuning. Is it still object-store-based, or something different?
> For particularly latency-sensitive or latency-tolerant workloads, operators can tune how aggressively Bufstream trades latency for cost.
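Purely as an illustration of that dial (not Bufstream's actual configuration, which I haven't seen), it's the same latency-vs-cost tradeoff Kafka producers already express through batching knobs:

```python
# Standard Kafka producer batching knobs: waiting longer per batch trades
# latency for fewer, larger writes. Broker address is a placeholder.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "bufstream.internal:9092",
    "linger.ms": 200,          # wait up to 200 ms to accumulate a batch
    "batch.size": 1_048_576,   # up to 1 MiB per batch before sending
    "compression.type": "zstd",
})

producer.produce("events", value=b'{"id": 1}')
producer.flush()
```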
The upcoming plan to keep an Iceberg store materialized and available for querying sounds so, so cool. Nice. You have my attention & interest!
> In the coming months, we'll also allow Bufstream operators to opt into storing some topics as Apache Iceberg tables. Kafka consumers will still be able to read from the topic, but SQL engines will also be able to query the data directly.
It'll be neat to see how this differs from the connector-based architecture. Whether maintenance, latency, or efficiency is the biggest winner here would make an excellent deep dive.
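If the topic really does land as a plain Iceberg table in the bucket, I'd expect any engine with Iceberg support to read it without touching the Kafka protocol at all. A rough sketch with DuckDB's iceberg extension, table location hypothetical:

```python
# Query an Iceberg table straight out of object storage with DuckDB's iceberg
# extension. The table location is a placeholder for wherever Bufstream
# materializes the topic; S3 credentials are assumed to be configured.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("INSTALL iceberg; LOAD iceberg;")

rows = con.execute(
    "SELECT count(*) FROM iceberg_scan('s3://bufstream-data/iceberg/events')"
).fetchall()
print(rows)
```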
I’m here to ask the obvious question: how does Bufstream compare to Warpstream?
The Developer Voices episode on Warpstream gave me vibes that those folks really know what they’re doing, and also went into a lot of detail on how they avoid frequent S3 writes, reimplement the equivalent of the Linux disk cache, etc. How does Bufstream make the equivalent choices and tradeoffs?
I didn't catch the Developer Voices episode, but it's on my listening list now!
At a low level, I'd guess we do many of the same things: batching writes, aggressively colocating and caching reads, leveraging multi-part uploads, and doing all the standard tail-at-scale stuff to manage S3 latency. We've also been testing with Antithesis, and we've reached out to Kyle Kingsbury.
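For anyone curious what the multi-part piece looks like on the S3 side, it's roughly this flow (a boto3 sketch, bucket/key names hypothetical); accumulating many small batches into one large object is what keeps PUT counts and costs down:

```python
# Accumulate writes locally, then ship them as parts of a single large S3 object.
# Bucket and key are placeholders; each part except the last must be >= 5 MiB.
import boto3

def batched_records():
    """Stand-in for whatever accumulates produced records into >= 5 MiB chunks."""
    yield b"x" * (5 * 1024 * 1024)
    yield b"y" * 1024

s3 = boto3.client("s3")
bucket, key = "bufstream-data", "segments/000002.log"

upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []
for part_number, chunk in enumerate(batched_records(), start=1):
    resp = s3.upload_part(
        Bucket=bucket, Key=key, PartNumber=part_number,
        UploadId=upload["UploadId"], Body=chunk,
    )
    parts.append({"ETag": resp["ETag"], "PartNumber": part_number})

s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=upload["UploadId"],
    MultipartUpload={"Parts": parts},
)
```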
Zoomed out a bit, a few differences with Warpstream jump out:
- Directionally, we want Bufstream to _understand_ the data flowing through it. We see so many Kafka teams struggling to manage data quality and effectively govern data usage, and we think they'd be better served by a queue that can do more than shuttle bytes around. Naturally, we come at that problem with a bias toward Protobuf.
- Bufstream runs fully isolated from Buf in its default configuration, and it doesn't rely on a proprietary metadata service.
- Bufstream supports transactions and exactly-once semantics [0]. We see these modern Kafka features used often, especially with Kafka Streams and Connect. Per their docs, Warpstream doesn't support them yet.
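For anyone unfamiliar with those APIs, this is the standard transactional producer flow, so existing code along these lines (broker address hypothetical) should work unchanged against Bufstream:

```python
# Standard Kafka transactional producer: both records commit atomically or not at all.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "bufstream.internal:9092",
    "transactional.id": "orders-processor-1",  # stable id enables exactly-once fencing
    "enable.idempotence": True,
})

producer.init_transactions()
producer.begin_transaction()
try:
    producer.produce("orders", key=b"42", value=b"created")
    producer.produce("audit-log", key=b"42", value=b"order created")
    producer.commit_transaction()
except Exception:
    producer.abort_transaction()
    raise
```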
Disaggregating storage and compute is a well-trodden path for infrastructure in the cloud, and it's past time for Kafka to join the party. I'm excited to see what shakes out of the next few years of innovation in this space.
The issue with all "drop-in replacements" for Kafka, IIRC, is that many large-scale deployments rely on exposed-but-internal aspects of Kafka. Do they also replicate those aspects?
Kafka Streams' internal topics are a good example. They're created by the Kafka Streams client library using the standard Kafka APIs, so they work just fine with Bufstream. The Kafka ecosystem takes a thick-client approach to many problems, so the same answer applies to many similar Kafka-adjacent systems.
There are, of course, some internal details that Bufstream doesn't recreate. We haven't seen many cases where application logic relies directly on the double-underscore internal topics, though - especially since much of that information is also exposed via admin APIs that Bufstream _does_ implement.
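To make the admin-API point concrete, the same metadata people sometimes scrape from internal topics is available through the standard admin client, e.g. (broker address hypothetical):

```python
# List topics (including double-underscore internal ones) via the standard admin API.
from confluent_kafka.admin import AdminClient

admin = AdminClient({"bootstrap.servers": "bufstream.internal:9092"})
metadata = admin.list_topics(timeout=10)

for name, topic in metadata.topics.items():
    print(name, len(topic.partitions), "partitions")
```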