What's the latest regarding multi-master (production-ready) setups in the PostgreSQL world these days? (i.e. without manual or application-level sharding) Last I checked, the "solutions" that were out there weren't ready for prime time.
What kind of load (number of concurrent app users) does your app have that it needs a multi-master DB?
It is quite easy to set up a powerful pg master node (32 CPU threads, fast SSDs) and many slave nodes for read-only queries. That setup can handle quite a lot of transactions.
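For reference, a minimal sketch of that kind of setup with stock streaming replication, assuming a 9.6-era configuration (recovery.conf) and made-up host names, paths, and credentials:

    # on the master: create a replication role and enable WAL shipping
    # (pg_hba.conf must also allow replication connections from the slaves;
    # changing wal_level requires a restart)
    psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';"
    psql -c "ALTER SYSTEM SET wal_level = replica;"
    psql -c "ALTER SYSTEM SET max_wal_senders = 10;"

    # on each slave: clone the master, then start as a hot standby
    pg_basebackup -h master.example.com -U replicator -D /var/lib/postgresql/data -X stream
    echo "hot_standby = on" >> /var/lib/postgresql/data/postgresql.conf
    echo "standby_mode = 'on'" >> /var/lib/postgresql/data/recovery.conf
    echo "primary_conninfo = 'host=master.example.com user=replicator password=secret'" \
      >> /var/lib/postgresql/data/recovery.conf
    pg_ctl -D /var/lib/postgresql/data start

Read-only queries can then be spread across the slaves while the master takes all writes.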
I'm genuinely curious what you mean here by high availability. Do you mean that during the failover window, while distributed consensus is being re-established, it is unacceptable for writes to be queued at the client?
When you say that there can't be sharding without more than one node taking writes, I totally see where you're coming from, and I agree with you. Typically that means there is a logical isolation between which data goes to which shard, so this avoids conflicts too. Usually when people talk about "multi-master" I do not believe they're thinking of this particular scenario, but I may be wrong. I'm not that familiar with it, but doesn't Postgres shard?
Well, since you were asking about this topic, I guess what I was trying to get at is: what specifically were you hoping to see from the community?
That's FUD; Citus doesn't "drop half of SQL". Anything that can be resolved to a single node works exactly like normal Postgres, and the main multi-node things you want (e.g. aggregations) work transparently. The actual limitations are usually easy to work around.
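For what it's worth, a minimal sketch of what that looks like in practice; the table and distribution column here are made up, while create_distributed_table is Citus's actual API:

    psql -c "CREATE EXTENSION citus;"
    psql -c "CREATE TABLE events (tenant_id int, payload jsonb);"
    # shard on tenant_id: queries filtering on it resolve to a single node
    psql -c "SELECT create_distributed_table('events', 'tenant_id');"
    # a cross-shard aggregation that Citus parallelizes transparently
    psql -c "SELECT tenant_id, count(*) FROM events GROUP BY tenant_id;"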
Can I have a two-node cluster of PostgreSQL machines where a failure in the first results in failing over to the second automatically? It's okay if I have to run something manual on the first just to bring that back up to speed and act as the new secondary.
Is that easy to set up (i.e. is there a sequence of documented steps to follow)? https://wiki.postgresql.org/wiki/Multimaster says this HA setup is possible but does not describe it, and I'm having trouble finding out whether it is fully supported or just a 'maybe you could do it if you tried this'.
> Can I have a two-node cluster of PostgreSQL machines where a failure in the first results in failing over to the second automatically? It's okay if I have to run something manual on the first just to bring that back up to speed and act as the new secondary.
You can do this with repmgr from 2ndQuadrant; see the documentation [0] for details on the configuration. Since repmgrd doesn't require anything other than Postgres to be running, it doesn't add any new failure points.
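Roughly, the per-node configuration looks something like the sketch below; this is based on the repmgr 3.x docs, and the cluster name, node identifiers, and conninfo values are placeholders:

    cat > /etc/repmgr.conf <<'EOF'
    cluster=prod
    node=2
    node_name=standby1
    conninfo='host=standby1.example.com user=repmgr dbname=repmgr'
    failover=automatic
    promote_command='repmgr standby promote -f /etc/repmgr.conf'
    follow_command='repmgr standby follow -f /etc/repmgr.conf'
    EOF
    # register the standby with the cluster and start the monitoring daemon
    repmgr -f /etc/repmgr.conf standby register
    repmgrd -f /etc/repmgr.conf --daemonize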
One thing you need to make sure you get right in such a setup is fencing the failed node. pgBouncer or pgPool is a good way to go about this: in your failover script you can modify the configuration on the pgBouncer server to point at the new master before the failover takes place. This prevents clients from talking to the failed master and avoids a split-brain.
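A hedged sketch of that fencing step; the pgbouncer.ini path and host names are assumptions, while RELOAD on the admin console is pgBouncer's real mechanism:

    # run on the pgBouncer host *before* promoting the standby:
    # repoint the database alias at the new master...
    sed -i 's/host=old-master.example.com/host=new-master.example.com/' \
      /etc/pgbouncer/pgbouncer.ini
    # ...then reload so clients can no longer reach the failed node
    psql -p 6432 -U pgbouncer pgbouncer -c "RELOAD;"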
Alternatively, you can use something like keepalived and STONITH, but I'm not comfortable relying on somehow shutting down or cutting off the other machine from the network; if things are going really wrong, that is less reliable than modifying a configuration file.
You can do it; the problems start when you then try to automatically bring the failed master back. PostgreSQL 9.6 makes this easier, and we should see some more automated tooling around it, but in general I have learned that unless you are using shared storage, automatic failover clusters can be hairy and are generally best avoided.
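For illustration, resynchronizing the old master as a standby after a failover looks roughly like the following; this is a sketch assuming pg_rewind (shipped with core since 9.5) and made-up hosts and paths:

    # on the failed old master, after it has been stopped
    # (requires wal_log_hints=on or data checksums on the old master)
    pg_rewind --target-pgdata=/var/lib/postgresql/data \
              --source-server='host=new-master.example.com user=postgres'
    # reconfigure it as a standby of the new master and start it
    echo "standby_mode = 'on'" >> /var/lib/postgresql/data/recovery.conf
    echo "primary_conninfo = 'host=new-master.example.com user=replicator'" \
      >> /var/lib/postgresql/data/recovery.conf
    pg_ctl -D /var/lib/postgresql/data start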
My long answer: after years of experience providing highly available databases with PostgreSQL, I concluded that I don't want automatic failover. It's more practical to keep a master running, constantly replicated by slaves that can be promoted to master in a migration or failure situation. However, a failure situation is exceptional enough that someone really needs to understand what happened before allowing the next victim (er, database) to go on stage.
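In that model, the promotion itself is a one-liner once a human has decided a slave should take over (the data directory path is an assumption):

    # on the chosen slave, after confirming the old master is really gone
    pg_ctl -D /var/lib/postgresql/data promote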
Thanks for the advice. Is there a sane setup guide with sane defaults for master-slave replication?
I would rather not discover afterwards that I was supposed to `ulimit` the postgres process to some CPU limit in order to be able to successfully kill queries that are jamming the box.
... and client libraries would need connection details for both nodes, so they can find the new master after failover.
That would be awesome, but stock Postgres only provides some of the building blocks.
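As one data point, libpq 10 and later do provide one such building block: a client can list both hosts and ask for whichever one accepts writes (host and database names here are assumptions):

    # libpq tries each host in order and keeps the one that is writable
    psql "postgresql://node1.example.com,node2.example.com/mydb?target_session_attrs=read-write"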
For the past few days I've been investigating stolon [1]. Conceptually, it makes sense to me, and so far it seems to work, but it's not simple to set up and maintain. For example, for leader election it uses etcd or consul. So initially you had one service that must not go down -- now you have two! ;-)
No, use anycast or VRRP/CARP for the Postgres servers. Use a watchdog script to monitor for master failure (see the sketch below). When the master fails, the secondary/slave IP becomes live, and the watchdog script takes the necessary action to make the secondary recognize it is the master and writable. Clients don't have to know what happened.
Bringing the old master back and making it primary again is a more difficult task. I would not want to automate this anyway, as it would be too easy to shoot your foot off.
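A deliberately naive sketch of such a watchdog; hosts, paths, and the check interval are assumptions, and a real deployment would hook into the keepalived/CARP state transition and fence the old master (see the pgBouncer discussion above):

    #!/bin/sh
    # runs on the standby; promotes it if the master stops answering
    while true; do
        if ! pg_isready -h master.example.com -t 5 >/dev/null; then
            # master looks dead: the floating IP has moved here via VRRP/CARP,
            # so promote the local standby to serve writes
            pg_ctl -D /var/lib/postgresql/data promote
            break
        fi
        sleep 5
    done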
Personally I think pgBouncer is a better solution for handling the switch to the correct master during a failure. I use keepalived (VRRP) for managing my HAProxy machines, and while it's certainly reliable, there are more issues you have to be aware of with it than with simply flipping the connection string in pgBouncer.
By the way, you don't need NOTIFY to hook into the replication stream. It is possible to get access to the changes directly from the JDBC driver, for example. Logical replication is a big deal!
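For anyone curious, a minimal sketch of consuming that stream with the stock pg_recvlogical CLI (the JDBC driver's replication API delivers the same changes programmatically); the slot and database names are made up, and the server needs wal_level = logical:

    # create a logical replication slot with the built-in test_decoding plugin
    pg_recvlogical -d mydb --slot=demo_slot --create-slot -P test_decoding
    # stream the decoded changes to stdout as they happen
    pg_recvlogical -d mydb --slot=demo_slot --start -f -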