What's the latest regarding multi-master (production-ready) setups in the PostgreSQL world these days? (i.e. without manual or application-level sharding) Last I checked, the "solutions" that were out there weren't ready for prime time.
What kind of load (number of concurrent app users) does your app have that it needs a multi-master DB?
It is quite easy to set up a powerful pg master node (32 CPU threads, fast SSDs) and many slave nodes for read-only queries. That setup can handle quite a lot of transactions.
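For reference, a minimal sketch of that kind of setup with stock streaming replication, assuming a 9.6-era configuration (recovery.conf) and made-up host names, paths, and credentials:

    # on the master: create a replication role and enable WAL shipping
    # (pg_hba.conf must also allow replication connections from the slaves;
    # changing wal_level requires a restart)
    psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';"
    psql -c "ALTER SYSTEM SET wal_level = replica;"
    psql -c "ALTER SYSTEM SET max_wal_senders = 10;"

    # on each slave: clone the master, then start as a hot standby
    pg_basebackup -h master.example.com -U replicator -D /var/lib/postgresql/data -X stream
    echo "hot_standby = on" >> /var/lib/postgresql/data/postgresql.conf
    echo "standby_mode = 'on'" >> /var/lib/postgresql/data/recovery.conf
    echo "primary_conninfo = 'host=master.example.com user=replicator password=secret'" \
      >> /var/lib/postgresql/data/recovery.conf
    pg_ctl -D /var/lib/postgresql/data start

Read-only queries can then be spread across the slaves while the master takes all writes.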
I'm genuinely curious what you mean here by high availability. Do you mean that during the failover window, while distributed consensus is being re-established, it is unacceptable for writes to be queued at the client?
When you say that there can't be sharding without more than one node taking writes, I totally see where you're coming from, and I agree with you. Typically that means there is a logical isolation between which data goes to which shard, so this avoids conflicts too. Usually when people talk about "multi-master" I do not believe they're thinking of this particular scenario, but I may be wrong. I'm not that familiar with it, but doesn't Postgres shard?
Well, since you were asking about this topic, I guess what I was trying to get at is: what specifically were you hoping to see from the community?
That's FUD; Citus doesn't "drop half of SQL". Anything that can be resolved to a single node works exactly like normal Postgres, and the main multi-node things you want (e.g. aggregations) work transparently. The actual limitations are usually easy to work around.
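For what it's worth, a minimal sketch of what that looks like in practice; the table and distribution column here are made up, while create_distributed_table is Citus's actual API:

    psql -c "CREATE EXTENSION citus;"
    psql -c "CREATE TABLE events (tenant_id int, payload jsonb);"
    # shard on tenant_id: queries filtering on it resolve to a single node
    psql -c "SELECT create_distributed_table('events', 'tenant_id');"
    # a cross-shard aggregation that Citus parallelizes transparently
    psql -c "SELECT tenant_id, count(*) FROM events GROUP BY tenant_id;"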
Can I have a two-node cluster of PostgreSQL machines where a failure in the first results in failing over to the second automatically? It's okay if I have to run something manual on the first just to bring that back up to speed and act as the new secondary.
Is that easy to set up (i.e. is there a sequence of documented steps to follow)? https://wiki.postgresql.org/wiki/Multimaster says this HA setup is possible but does not describe it, and I'm having trouble finding out whether it is fully supported or just a 'maybe you could do it if you tried this'.
> Can I have a two-node cluster of PostgreSQL machines where a failure in the first results in failing over to the second automatically? It's okay if I have to run something manual on the first just to bring that back up to speed and act as the new secondary.
You can do this with repmgr from 2ndQuadrant; see the documentation [0] for details on the configuration. Since repmgrd doesn't require anything other than Postgres to be running, it doesn't add any new failure points.
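Roughly, the per-node configuration looks something like the sketch below; this is based on the repmgr 3.x docs, and the cluster name, node identifiers, and conninfo values are placeholders:

    cat > /etc/repmgr.conf <<'EOF'
    cluster=prod
    node=2
    node_name=standby1
    conninfo='host=standby1.example.com user=repmgr dbname=repmgr'
    failover=automatic
    promote_command='repmgr standby promote -f /etc/repmgr.conf'
    follow_command='repmgr standby follow -f /etc/repmgr.conf'
    EOF
    # register the standby with the cluster and start the monitoring daemon
    repmgr -f /etc/repmgr.conf standby register
    repmgrd -f /etc/repmgr.conf --daemonize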
One thing you need to make sure you get right in such a setup is fencing the failed node. pgBouncer or pgPool is a good way to go about this: in your failover script you can modify the configuration on the pgBouncer server to point at the new master before the failover takes place. This prevents clients from talking to the failed master and avoids a split-brain.
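A hedged sketch of that fencing step; the pgbouncer.ini path and host names are assumptions, while RELOAD on the admin console is pgBouncer's real mechanism:

    # run on the pgBouncer host *before* promoting the standby:
    # repoint the database alias at the new master...
    sed -i 's/host=old-master.example.com/host=new-master.example.com/' \
      /etc/pgbouncer/pgbouncer.ini
    # ...then reload so clients can no longer reach the failed node
    psql -p 6432 -U pgbouncer pgbouncer -c "RELOAD;"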
Alternatively, you can use something like keepalived and STONITH, but I'm not comfortable relying on somehow shutting down or cutting off the other machine from the network; if things are going really wrong, that is less reliable than modifying a configuration file.
You can do it; the problems start when you then try to automatically bring the failed master back. PostgreSQL 9.6 makes this easier, and we should see some more automated tooling around it, but in general I have learned that unless you are using shared storage, automatic failover clusters can be hairy and are generally best avoided.
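For illustration, resynchronizing the old master as a standby after a failover looks roughly like the following; this is a sketch assuming pg_rewind (shipped with core since 9.5) and made-up hosts and paths:

    # on the failed old master, after it has been stopped
    # (requires wal_log_hints=on or data checksums on the old master)
    pg_rewind --target-pgdata=/var/lib/postgresql/data \
              --source-server='host=new-master.example.com user=postgres'
    # reconfigure it as a standby of the new master and start it
    echo "standby_mode = 'on'" >> /var/lib/postgresql/data/recovery.conf
    echo "primary_conninfo = 'host=new-master.example.com user=replicator'" \
      >> /var/lib/postgresql/data/recovery.conf
    pg_ctl -D /var/lib/postgresql/data start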
My long answer: after years of experience providing highly available databases with PostgreSQL, I concluded that I don't want automatic failover. It's more practical to keep a master running, constantly replicated by slaves that can be promoted to master in a migration or failure situation. However, a failure situation is exceptional enough that someone really needs to understand what happened before allowing the next victim (er, database) to go on stage.
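In that model, the promotion itself is a one-liner once a human has decided a slave should take over (the data directory path is an assumption):

    # on the chosen slave, after confirming the old master is really gone
    pg_ctl -D /var/lib/postgresql/data promote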
Thanks for the advice. Is there a sane setup guide with sane defaults for master-slave replication?
I would rather not discover afterwards that I was supposed to `ulimit` the postgres process to some CPU limit in order to be able to successfully kill queries that are jamming the box.
... and client libraries would need connection details for both nodes, so they can find the new master after failover.
That would be awesome, but stock Postgres only provides some of the building blocks.
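As one data point, libpq 10 and later do provide one such building block: a client can list both hosts and ask for whichever one accepts writes (host and database names here are assumptions):

    # libpq tries each host in order and keeps the one that is writable
    psql "postgresql://node1.example.com,node2.example.com/mydb?target_session_attrs=read-write"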
For the past few days I've been investigating stolon [1]. Conceptually, it makes sense to me, and so far it seems to work, but it's not simple to set up and maintain. For example, for leader election it uses etcd or consul. So initially you had one service that must not go down -- now you have two! ;-)
No, use anycast or VRRP/CARP for the Postgres servers. Use a watchdog script to monitor for master failure (see the sketch below). When the master fails, the secondary/slave IP becomes live, and the watchdog script takes the necessary action to make the secondary recognize it is the master and writable. Clients don't have to know what happened.
Bringing the old master back and making it primary again is a more difficult task. I would not want to automate this anyway, as it would be too easy to shoot your foot off.
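A deliberately naive sketch of such a watchdog; hosts, paths, and the check interval are assumptions, and a real deployment would hook into the keepalived/CARP state transition and fence the old master (see the pgBouncer discussion above):

    #!/bin/sh
    # runs on the standby; promotes it if the master stops answering
    while true; do
        if ! pg_isready -h master.example.com -t 5 >/dev/null; then
            # master looks dead: the floating IP has moved here via VRRP/CARP,
            # so promote the local standby to serve writes
            pg_ctl -D /var/lib/postgresql/data promote
            break
        fi
        sleep 5
    done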
Personally I think pgBouncer is a better solution for handling the switch to the correct master during a failure. I use keepalived (VRRP) for managing my HAProxy machines, and while it's certainly reliable, there are more issues you have to be aware of with it than with simply flipping the connection string in pgBouncer.
By the way, you don't need NOTIFY to hook into the replication stream. It is possible to get access to the changes directly from the JDBC driver, for example. Logical replication is a big deal!
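For anyone curious, a minimal sketch of consuming that stream with the stock pg_recvlogical CLI (the JDBC driver's replication API delivers the same changes programmatically); the slot and database names are made up, and the server needs wal_level = logical:

    # create a logical replication slot with the built-in test_decoding plugin
    pg_recvlogical -d mydb --slot=demo_slot --create-slot -P test_decoding
    # stream the decoded changes to stdout as they happen
    pg_recvlogical -d mydb --slot=demo_slot --start -f -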