
Pebble does not currently implement MultiGet because CockroachDB never used RocksDB's MultiGet operation. CockroachDB can use multiple nodes to process a query by decomposing SQL queries along data boundaries and shipping the query parts to be executed next to the data. CockroachDB can't directly use MultiGet because that API is not compatible with how CockroachDB reads keys.

RocksDB MultiGet is interesting. Parallelism is achieved by using new IO interfaces (io_uring), not by using threads. That approach seems right to me. See https://github.com/facebook/rocksdb/wiki/MultiGet-Performanc.... My understanding is that io_uring support is still a work in progress. We experimented at one point with using goroutines in Pebble to parallelize lookups, but doing so was strictly worse for performance. Experimenting with io_uring is something we'd like to do.
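For readers wondering what "using goroutines to parallelize lookups" might look like, here is a minimal sketch. It is not Pebble's code: the `store` type, its `get` method, and `parallelGet` are hypothetical names, and an in-memory map stands in for the real storage engine. It only shows the shape of the approach, which the comment above reports was slower in practice for Pebble.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a key-value store; it is not Pebble's API.
type store map[string]string

// get returns the value for key and whether it was present.
func (s store) get(key string) (string, bool) {
	v, ok := s[key]
	return v, ok
}

// parallelGet looks up every key in its own goroutine and returns the
// values in the same order as keys. Missing keys yield an empty string.
func parallelGet(s store, keys []string) []string {
	out := make([]string, len(keys))
	var wg sync.WaitGroup
	for i, k := range keys {
		wg.Add(1)
		go func(i int, k string) {
			defer wg.Done()
			if v, ok := s.get(k); ok {
				out[i] = v // each goroutine writes only its own slot
			}
		}(i, k)
	}
	wg.Wait() // block until every lookup has finished
	return out
}

func main() {
	s := store{"a": "1", "b": "2", "c": "3"}
	fmt.Println(parallelGet(s, []string{"c", "missing", "a"}))
}
```

Each lookup pays for goroutine scheduling and synchronization, which is one plausible reason this pattern can lose to a serial loop when individual lookups are cheap.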




Indeed, the conceptual fork point mentioned is RocksDB 6.2.1, which came before those features. The problem with RocksDB is that one thread only makes one request at a time. I should've phrased my question more succinctly: Is Pebble/CockroachDB capable of saturating the backplane with requests in parallel? Does it multiplex a single query by dispatching smaller requests to a thread-pool?


> Is Pebble/CockroachDB capable of saturating the backplane with requests in parallel?

Yes.

> Does it multiplex a single query by dispatching smaller requests to a thread-pool?

Yes, though it depends on the query. Trivial queries (i.e. single-row lookups) are executed serially as that is the fastest way to execute them. Complex queries are decomposed along data boundaries and the query parts are executed in parallel next to where the data is located.
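A rough sketch of "decomposed along data boundaries and executed in parallel", for illustration only. It is not CockroachDB's implementation: `span`, `splitByBoundaries`, and `scanParallel` are hypothetical names, keys are plain integers, and the boundaries are assumed to be a sorted list of split points. A real system would ship each part to the node holding that data; here a goroutine per part stands in for that.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a half-open key interval [start, end). Hypothetical type.
type span struct{ start, end int }

// splitByBoundaries cuts s at every boundary that falls strictly inside it.
// boundaries must be sorted in ascending order.
func splitByBoundaries(s span, boundaries []int) []span {
	var parts []span
	cur := s.start
	for _, b := range boundaries {
		if b > cur && b < s.end {
			parts = append(parts, span{cur, b})
			cur = b
		}
	}
	if cur < s.end {
		parts = append(parts, span{cur, s.end})
	}
	return parts
}

// scanParallel runs scan on each part concurrently and concatenates the
// per-part results in the order of parts.
func scanParallel(parts []span, scan func(span) []int) []int {
	results := make([][]int, len(parts))
	var wg sync.WaitGroup
	for i, p := range parts {
		wg.Add(1)
		go func(i int, p span) {
			defer wg.Done()
			results[i] = scan(p) // each goroutine writes only its own slot
		}(i, p)
	}
	wg.Wait()
	var out []int
	for _, r := range results {
		out = append(out, r...)
	}
	return out
}

func main() {
	data := []int{1, 4, 7, 12, 15, 21, 28} // sorted keys, stand-in for stored rows
	scan := func(p span) []int {
		var hits []int
		for _, k := range data {
			if k >= p.start && k < p.end {
				hits = append(hits, k)
			}
		}
		return hits
	}
	parts := splitByBoundaries(span{3, 25}, []int{10, 20, 30})
	fmt.Println(parts, scanParallel(parts, scan))
}
```

A query whose span falls inside a single boundary yields one part, so it degenerates to a single scan, which loosely mirrors the point that trivial lookups are better run serially.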



