No, and nor did I mention "fencing". I said "minimum viable". No other open sour...

waitwaitwhay · on March 24, 2015

> No other open source RDBMS manages those things, so calling postgres out for lacking them is a little disingenuous.

Doesnt MySQL Cluster do this pretty well?

davidgerard · on March 24, 2015

MySQL clustering is a goddamn PITA, just as MySQL replication is, regularly requiring you to take a plunger to unclog it. The quirks and bugs are myriad.

moe · on March 24, 2015

fencing

Which relates to a master/slave cluster as in how?

no other open source RDBMS out there manages those things either

Rethinkdb, Redis, Riak, ElasticSearch all do.

(them not being relational is irrelevant to replication)

Making it a runtime thing is one way of doing it.

It's the only way of doing it for modern, dynamic application layouts (the whole "cloud" thing). The stereotypical static database cluster, carefully configured by a graybeard once and then never touched again, it still exists. But it's rapidly losing ground.

Puppet clones the repo on spinning up a new machine, and I repoint the symlinks as necessary.

So how do you monitor e.g. the replication lag?

ploxiln · on March 24, 2015

Bringing up Redis, Riak, and ElasticSearch is nice...

You see, ElasticSearch gets split brain surprisingly easily, if you have a real significant load / number of hosts / host failures you will most definitely experience it, whereas with most other databases it'll only happen once every couple of years to the very largest and heaviest users. It's also just not known for data consistency, you just have to be able to refresh data from some other canonical source periodically.

Redis requires 2x memory present as is used to serve the load of requests, in order to backup. You can only safely really use half the server's memory. Because it has to fork to either save the snapshot or to compact the write log.

Riak performance is difficult to manage and keep consistent, as you replace failed nodes.

What I have experience with is a very large / heavy use of mysql at a popular website. We did manual/application sharding, and master/slave streaming replication, with automatic read-failover to slave, but only manual promotion. The slave did backups each day, and there was enough transaction log on the master and slave to spin up a new slave to either, with a backup up to 3 days old, and have it catch up. The nature of the load was that it was not possible to take a backup of the master in order to seed a new slave, because the traffic never stops and we could not afford the memory or iops overhead of backing up the master, and we really didn't do downtime if we could help it. A couple hours a year maybe.

We had trending and alerting of everything, including replication, roles, and we automated it all with something not unlike the configuration it appears you have to use for postgresql. Frankly these new-fangled auto-everything databases are untrustworthy, I've seen three of them fail (in ways you might not notice if you're not a graybeard), and the fourth has a name that really puts me off. RethinkDB? yeah, go ahead kids. I just love all this unreliable crap on the internet these days...

PS I'm 26 :)

moe · on March 24, 2015

Frankly these new-fangled auto-everything databases are untrustworthy

I'm not asking for postgres to change its ways and become "untrustworthy".

I'm merely asking that postgres wraps more automation and convenience around the things that it already does.

Replication should generally be operated from inside psql. Spinning up a slave should be a single command. Failing over should be a single command. There should be meaningful error messages, ETAs and progress bars. No rsync'ing of WAL files, no futzing with config files. No elaborate third party 'repmgr' daemons.

pyvpx · on March 24, 2015

> I'm merely asking that postgres wraps more automation and convenience around the things that it already does.

and if history is any indication, it will when it's good and ready to be convenient and automated. as a previous poster (or two) most eloquently noted, postgres tends to work from the bottom up. when the "fundamentals" have been hashed out and thoroughly tested through all or nearly all edge cases, then convenient interfaces "bubble up"

and hey, your patches are always welcomed :)

moe · on March 24, 2015

and if history is any indication, it will

Very true. I was merely bemoaning the status quo. This is a thread about a third party tool to patch up a rather severe UX flaw after all...

rosser · on March 24, 2015

fencing ... relates to a master/slave cluster as in how?

Because when you promote a slave, you also need to fence writes to the old master, otherwise you can end up in a "split-brain" scenario, where the old and new masters are both accepting writes, and nobody's replicating.

Have fun cleaning that up.

(them not being relational is irrelevant to replication)

I disagree. Their being relational imposes extra constraints upon replication. ACID?

So how do you monitor e.g. the replication lag?

Icinga invokes this: https://bucardo.org/check_postgres/check_postgres.pl.html#re...

Look, I'm rapidly losing interest in going back and forth like this. You don't seem very interested in the validity of another perspective, but rather just keep finding fault in things that don't do it your preferred way. If you want to have a discussion, I'll play. If not, let's just walk away.

moe · on March 24, 2015

Because when you promote a slave, you also need to fence writes to the master, otherwise you can end up in a "split-brain" scenario.

There is only one master, hence there can not be a "split brain". You seem to be confusing master/slave and multi-master configuration (which pg doesn't support).

I disagree. Their being relational imposes extra constraints upon replication. ACID?

Well, it doesn't.

Look, I'm rapidly losing interest in going back and forth like this.

You claim things are "easy" that you clearly haven't done before. I was merely calling you out on that, but agree the discussion is unlikely to go anywhere from here.

Edit: Oh and you retouching your comments all the time doesn't help either (the split-brain paragraph just changed...). I'm out.

atombender · on March 24, 2015

> There is only one master, hence there can not be a "split brain".

That's very naive. There are situations -- very real ones -- where the failing master doesn't know it is failing, where clients can still connect and the new master can't safely tell the old one to stop being a master. This is where fencing comes in, and other concepts like heartbeats and STONITH and backup interconnects and so on, all of which makes automatic failover complicated and error-prone.

There are good reasons that 35+ years after the invention of the relational database, only the top commercial ones implement this really cleanly (but they still require expensive dedicated DBAs, for the most part).

rosser · on March 24, 2015

You claim things are "easy" that you clearly haven't done before.

I've been a PostgreSQL DBA for a decade, and built HA setups for my last three employers. Care to reevaluate your claims of what I have or haven't done?

atombender · on March 24, 2015

> Rethinkdb, Redis, Riak, ElasticSearch all do.

None of those are RDBMSes. Since they all lack schemas, constraints, strict consistency and ACID transactions, their replication logic can be much simpler. There are some very good reasons that true multimaster replication is nearly unknown among relational databases.

moe · on March 24, 2015

Since they all lack schemas, constraints, strict consistency and ACID transactions, their replication logic can be much simpler.

Why would an RDBMS txlog be somehow harder to ship than any other db log?

And how is that an excuse for crappy usability in any case?

atombender · on March 24, 2015

Conflict management and consistency in general.

The products you mention have no interdependencies within relations or between relations, such as an RDBMS has, which means that they can just push streams of individual K/V pairs without having to maintain consistency between them.

Most of those projects only offer temporal consistency (last write wins) and the CRDTs offered (eg., Riak) are exceedingly simple compared to the complexities of managing uniqueness constraints or foreign key constraints across transactions.

Hinted handoff -- a common technique that mitigates split-brain data loss and increases availability in a shared scenario -- simply wouldn't work in a relational scenario because nobody would have any idea if the changes would apply when the target shard came back.

Projects like Postgres-XC (ActorDB also looks promising) have shown that multimaster replication is clearly possible in a relational database, but also that it's clearly a difficult problem to crack, because as has been discovered the last decade or so, consistency is the enemy of availability and partition tolerance.

Witness the very slow progress made by XC to retrofit multimaster replication into Postgres, and the many challenges (such as sequences) they have had to solve. Or the fact that it's pretty much alone in the open-source field in trying to solve this issue with a modern relational database. Even very mature, advanced, commercial solutions like Oracle RAC and Oracle Advanced Replication require a lot of configuration, careful design and monitoring to get right.

Note that I'm explicitly using the projects you mentioned as the benchmark here, ie. multimaster-replicated stores with automatic failover based on consensus protocols. You can build a simpler system — single master, consensus-based election, strict consistency guarantee across shards (reducing either availability or partition tolerance) — but it's still harder than you seem to think it is.

moe · on March 24, 2015

Hinted handoff multimaster

What does multi-master have to do with anything?

The subject was making the postgres master/slave replication as easy to use as the aforementioned db's.

Yes, some of them use more advanced clustering models (multi-master) than postgres and are still easier to use. That doesn't exactly make postgres look better in the comparison.

atombender · on March 24, 2015

Are you being deliberately obtuse? You explicitly mentioned four multimaster key/value stores:

  Rethinkdb, Redis, Riak, ElasticSearch all do.
  (them not being relational is irrelevant to replication)

I argued that those are not RDBMSes, and that their replication model was much simpler, something you still seem to be arguing against.

I don't disagree that Postgres's replication could be much nicer, but you're not articulating your position well. For one, bringing up these completely apples-to-orange comparisons makes no sense.

moe · on March 24, 2015

Are you being deliberately obtuse?

You're the one being obtuse here.

I introduced four databases, three of which use more complex replication models than PG, all of which are easier to admin than PG. And I'm asking PG to make their (simpler) replication as easy to use as these other databases.

If that's too confusing then just use Redis for reference and ignore the other three. Redis is plain master/slave and spinning up a slave takes a single command ("SLAVEOF foo"). That's exactly how it should be in postgresql as well. Not more, not less.

atombender · on March 24, 2015

They can and do use different replication models because their data model is explicitly wildly different from Postgres. For example, Postgres couldn't support eventual consistency even if it wanted to. You're barking up the wrong tree.

As for Redis, sure, it has a simple master/slave system and Postgres could offer a similarly simple command UI that would let you issue a command instead of doing the current, trivial amount of setup. I don't disagree, but I think it's an inconsequential problem that affects a sysadmin/devops person about five minutes each year, and an issue that ultimately only deters dilettantes. The real problem to solve is dynamic promotion/demotion of masters and slave. It's not entirely trivial [1].

[1] http://redis.io/topics/sentinel

moe · on March 24, 2015

Postgres couldn't support eventual consistency even if it wanted to. You're barking up the wrong tree.

And you're barking at a strawman. Nobody was talking about eventual consistency.

I don't disagree

You could do that less verbosely then.

The real problem to solve is dynamic promotion/demotion of masters

This whole thread is about how postgres hasn't even streamlined the manual promotion/demotion of one master in a simple master/slave pairing.