- For native Postgres scaling strategies, we could call attention to Crunchy Data or Percona
Very nice to lay this out and I pretty much agree with the labels and categories on here. I would be you will get some people reaching out since Postgres compatibility is something these vendors compete on.
- Materialize is not really focused on scaling because that is not really what it is for
Horizontal scalability is actually something that Materialize's underlying incremental computation engine (timely/differential dataflow) excels at. We're actively working to expose that tech in Materialize. (As in: many engineers, including me, are working on this full time; happy to answer questions!)
It is too bad that you got the wrong impression from that blog post! That blog post is about two years out of date. We're working on an updated roadmap blog post. The tl;dr is that we've since moved from step 3 on the roadmap ("Build a single-node database management system") to step 4 ("Build cloud-native elasticity and replication: Materialize Cloud").
Ok that is my bad! Sorry I was curious as to the answer to the question in the * scaling cell in the original blog post and searched 'materialized scaling' and came across that. Regardless I am actually a user of Materialize and quite like it.
Great observations around how popular and useful it is for systems to have some level of Postgres line compatibility! Thanks for doing the writeup!
One thing that's not totally clear in the comparison table is that most of these implementations are just that, databases providing an abstraction through the Postgres wire protocol over a different database architecture/parser & solution.
TimescaleDB, however, is the only one listed that is built directly on Postgres as an extension. There is no abstraction layer over some other implementation of the query parser or storage layer.
When it comes to application maintenance and developer expectations at query time, not all of these are apples-to-apples comparisons, as some of the other comments have noted.
While the table doesn't say that much about intent (other than what you might infer from the SQL Compatibility, ACID and Category columns), I had hoped it was clear from the rest of the article that I was saying there is no relationship between the wire protocol and being PostgreSQL compatible in general.
> It does NOT encompass the actual query language itself, let alone database semantics.
> This doesn't mean that any PostgreSQL or MySQL query will work with them since, as mentioned previously, query language and database semantics are independent of the wire protocol.
> To reiterate, the wire protocol doesn't specify anything about the query language.
> Just because a database implements the PostgreSQL wire protocol does not mean it intends to be a drop-in replacement for PostgreSQL.
> So just looking at parser support is not enough to talk about PostgreSQL query or semantic compatibility. But it is a starting point.
Thanks for this write up! I've been really interested in postgres compatibility in the context of a tool I maintain (https://github.com/mergestat/mergestat) that uses SQLite. I've been looking for a way to expose the SQLite capabilities over a more commonly used wire-protocol like postgres (or mysql) so that existing BI and visualization tools can access the data.
This project is an interesting one: https://github.com/dolthub/go-mysql-server that provides a MySQL interface (wire and SQL) to arbitrary "backends" implemented in go.
It's really interesting how compatibility with existing protocols has become an important feature of new databases - there's so much existing tooling that already speaks postgres (or mysql), being able to leverage that is a huge advantage IMO
Shameless plug, but your list is missing Splitgraph [0] :)
We’ve been based on Postgres from the beginning, and although the backend is a bit more complex at this point, we’ve kept the wire protocol intact. We’re also heavily invested in FDWs, not only for federated queries (e.g. querying data at Snowflake – btw, you might enjoy our blog post on achieving a 100x speedup with aggregation pushdown [1]), but also for queries on warehoused data stored as Splitgraph images.
By keeping Postgres compatibility as our guiding constraint, we’ve been able to build a lot of functionality on top of just a few simple abstractions. The result is something akin to a magic Postgres database – you can connect dozens of live sources to it using FDW plugins, or you can ingest from hundreds of data sources using Airbyte connectors, ultimately storing the data as immutable Splitgraph images in object storage. But as far as your standard Postgres client is concerned, you’re just connected to a normal database with 40k tables on it.
As for the wire protocol, our implementation is heavily reliant on (a forked version of) PgBouncer. Basically, a query arrives, we parse it for references to tables (which look like Docker image tags), and the proxy layer performs whatever orchestration is necessary to satisfy the query. That could mean instantiating a foreign server to a saved connection, loading some data from object storage, or even lazily loading only the requisite data (we call this “layered querying” since it’s implemented similarly to AUFS). In the future, it could also mean delegating the query to a more specialized engine like Presto.
Point is, by keeping the frontend intact, we’re able to retain compatibility with all Postgres clients, but we’re free to implement the backend in more scalable or domain-specific ways. For example, we’re able to horizontally scale our query capacity by simply adding more “cache nodes” that perform the layered querying.
We are definitely all-in on the Postgres wire protocol, and all the ecosystem compatibility that comes along with it. You can read our blog for more in-depth discussions of this, but I don’t want to spam too many links here. :)
Postgres wire compatibility does not imply Postgres compatibility.
I recently got bit by this, I wanted to convert a mail address database from MySQL Galera to CockroachDB, because CockroachDB was just a bit easier to deal with.
Unfortunately, Cockroach only support UTF-8, and Postgres only supports LATIN1 for the database (compiled into code). I believe that there are reasons for both sides to make this choice, unfortunately it turned what I hoped would be a slam dunk into a round peg in a square hole.
- YugabyteDB is currently on Postgres 11.2 with plans to keep upgrading.
- Cockroach is not pinned to any Postgres version so its compatibility is reliant on their implementation. Not a bad thing, just different than YB.
- Like you said Timescale is in lockstep with latest Postgres (14 at the time of this comment) https://blog.timescale.com/blog/massive-scale-for-time-serie...
- Materialize is not really focused on scaling because that is not really what it is for, https://materialize.com/blog-roadmap/
- I might consider moving Spanner to the Somewhat compatible category. Lack of stored procedures and triggers makes it not a drop in replacement for those moving from Oracle/MS SQL https://cloud.google.com/spanner/docs/reference/postgresql/o...
- For native Postgres scaling strategies, we could call attention to Crunchy Data or Percona
Very nice to lay this out and I pretty much agree with the labels and categories on here. I would be you will get some people reaching out since Postgres compatibility is something these vendors compete on.