Hacker News new | past | comments | ask | show | jobs | submit login
The world of PostgreSQL wire compatibility (multiprocess.io)
106 points by EntICOnc on Feb 10, 2022 | hide | past | favorite | 18 comments



Well done. Couple of notes:

- YugabyteDB is currently on Postgres 11.2 with plans to keep upgrading.

- Cockroach is not pinned to any Postgres version so its compatibility is reliant on their implementation. Not a bad thing, just different than YB.

- Like you said Timescale is in lockstep with latest Postgres (14 at the time of this comment) https://blog.timescale.com/blog/massive-scale-for-time-serie...

- Materialize is not really focused on scaling because that is not really what it is for, https://materialize.com/blog-roadmap/

- I might consider moving Spanner to the Somewhat compatible category. Lack of stored procedures and triggers makes it not a drop in replacement for those moving from Oracle/MS SQL https://cloud.google.com/spanner/docs/reference/postgresql/o...

- For native Postgres scaling strategies, we could call attention to Crunchy Data or Percona

Very nice to lay this out and I pretty much agree with the labels and categories on here. I would be you will get some people reaching out since Postgres compatibility is something these vendors compete on.


- Materialize is not really focused on scaling because that is not really what it is for

Horizontal scalability is actually something that Materialize's underlying incremental computation engine (timely/differential dataflow) excels at. We're actively working to expose that tech in Materialize. (As in: many engineers, including me, are working on this full time; happy to answer questions!)

It is too bad that you got the wrong impression from that blog post! That blog post is about two years out of date. We're working on an updated roadmap blog post. The tl;dr is that we've since moved from step 3 on the roadmap ("Build a single-node database management system") to step 4 ("Build cloud-native elasticity and replication: Materialize Cloud").


Ok that is my bad! Sorry I was curious as to the answer to the question in the * scaling cell in the original blog post and searched 'materialized scaling' and came across that. Regardless I am actually a user of Materialize and quite like it.


Timescale DevRel here!

Great observations around how popular and useful it is for systems to have some level of Postgres line compatibility! Thanks for doing the writeup!

One thing that's not totally clear in the comparison table is that most of these implementations are just that, databases providing an abstraction through the Postgres wire protocol over a different database architecture/parser & solution.

TimescaleDB, however, is the only one listed that is built directly on Postgres as an extension. There is no abstraction layer over some other implementation of the query parser or storage layer.

When it comes to application maintenance and developer expectations at query time, not all of these are apples-to-apples comparisons, as some of the other comments have noted.


While the table doesn't say that much about intent (other than what you might infer from the SQL Compatibility, ACID and Category columns), I had hoped it was clear from the rest of the article that I was saying there is no relationship between the wire protocol and being PostgreSQL compatible in general.

> It does NOT encompass the actual query language itself, let alone database semantics.

> This doesn't mean that any PostgreSQL or MySQL query will work with them since, as mentioned previously, query language and database semantics are independent of the wire protocol.

> To reiterate, the wire protocol doesn't specify anything about the query language.

> Just because a database implements the PostgreSQL wire protocol does not mean it intends to be a drop-in replacement for PostgreSQL.

> So just looking at parser support is not enough to talk about PostgreSQL query or semantic compatibility. But it is a starting point.

But I guess it wasn't clear.


CrateDB DevRel here :)

> databases providing an abstraction through the Postgres wire protocol

I would not call it an abstraction, if one has a full parser, analyzer, planner and execution engine. It is just a common language ;)


Hey folks! Author here. Happy for questions, suggestions, flames.


> Proprietary databases like Oracle and SQL Server find value in developing their own drivers. They don't tend to publish their wire protocol.

Just to note, MS have long published the details of their wire protocol, TDS - https://docs.microsoft.com/en-us/openspecs/windows_protocols...


Thanks I should be fairer to them. Updated the post to note this.


btw I think you meant to say "worth noting" instead if "worth nothing" ;)


lol, oops. Not a freudian slip. Edited, thank you.


The [MS-TDS].pdf is also an excellent piece of documentation, gave me all the answers to write a fully compatible TDS driver.


Thanks for this write up! I've been really interested in postgres compatibility in the context of a tool I maintain (https://github.com/mergestat/mergestat) that uses SQLite. I've been looking for a way to expose the SQLite capabilities over a more commonly used wire-protocol like postgres (or mysql) so that existing BI and visualization tools can access the data.

This project is an interesting one: https://github.com/dolthub/go-mysql-server that provides a MySQL interface (wire and SQL) to arbitrary "backends" implemented in go.

It's really interesting how compatibility with existing protocols has become an important feature of new databases - there's so much existing tooling that already speaks postgres (or mysql), being able to leverage that is a huge advantage IMO


Shameless plug, but your list is missing Splitgraph [0] :)

We’ve been based on Postgres from the beginning, and although the backend is a bit more complex at this point, we’ve kept the wire protocol intact. We’re also heavily invested in FDWs, not only for federated queries (e.g. querying data at Snowflake – btw, you might enjoy our blog post on achieving a 100x speedup with aggregation pushdown [1]), but also for queries on warehoused data stored as Splitgraph images.

By keeping Postgres compatibility as our guiding constraint, we’ve been able to build a lot of functionality on top of just a few simple abstractions. The result is something akin to a magic Postgres database – you can connect dozens of live sources to it using FDW plugins, or you can ingest from hundreds of data sources using Airbyte connectors, ultimately storing the data as immutable Splitgraph images in object storage. But as far as your standard Postgres client is concerned, you’re just connected to a normal database with 40k tables on it.

As for the wire protocol, our implementation is heavily reliant on (a forked version of) PgBouncer. Basically, a query arrives, we parse it for references to tables (which look like Docker image tags), and the proxy layer performs whatever orchestration is necessary to satisfy the query. That could mean instantiating a foreign server to a saved connection, loading some data from object storage, or even lazily loading only the requisite data (we call this “layered querying” since it’s implemented similarly to AUFS). In the future, it could also mean delegating the query to a more specialized engine like Presto.

Point is, by keeping the frontend intact, we’re able to retain compatibility with all Postgres clients, but we’re free to implement the backend in more scalable or domain-specific ways. For example, we’re able to horizontally scale our query capacity by simply adding more “cache nodes” that perform the layered querying.

We are definitely all-in on the Postgres wire protocol, and all the ecosystem compatibility that comes along with it. You can read our blog for more in-depth discussions of this, but I don’t want to spam too many links here. :)

[0] https://www.splitgraph.com

[1] https://www.splitgraph.com/blog/postgresql-fdw-aggregation-p...


Postgres wire compatibility does not imply Postgres compatibility.

I recently got bit by this, I wanted to convert a mail address database from MySQL Galera to CockroachDB, because CockroachDB was just a bit easier to deal with.

Unfortunately, Cockroach only support UTF-8, and Postgres only supports LATIN1 for the database (compiled into code). I believe that there are reasons for both sides to make this choice, unfortunately it turned what I hoped would be a slam dunk into a round peg in a square hole.


What do you mean with postgres only supporting LATIN1?


Dang, I meant PostFIX only supports LATIN1.


Correction, which I can't do because it's been downvoted, POSTFIX only supports LATIN1, so I can't use Cockroach with it.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: