
PostgreSQL upgrades are a piece of cake. pg_upgrade just works. Or you could just dump and load the database (maybe even pipe it from one to the next). Minor upgrades are even simpler: forgot to compile a module? make and move the .so to lib/.
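For the dump-and-pipe path, a minimal sketch (the port numbers are illustrative and assume the old and new clusters are running side by side):

  # stream the old cluster straight into the new one, no intermediate file
  pg_dumpall -p 5432 | psql -d postgres -p 5433

You'd still want a real backup first for anything non-trivial, but the mechanics really are that short.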

Oracle upgrades are the real nightmare.



> Or you could just dump and load the database

This is sometimes not so easy if the database is huge and you can't have downtime. The article is also mostly focused on things related to running a replicated setup, which makes things a lot harder than a pg_upgrade that you "just" run on your production database.

PG upgrades and the pg_upgrade tool work well, but it's not just something you run on a Friday evening if it's bigger than a side project with 2 users.


If you "can't" have any downtime at all then no matter if it's Postgres or MariaDB, it's going to require a replicated setup. You can't rely on a single instance.

It's not very fair to say "upgrades on Postgres are hard" if that's practically always true in your use-case, that's all.


> It's not very fair to say "upgrades on Postgres are hard" if that's practically always true in your use-case, that's all.

That is just nonsense. If it's hard then it's hard. It's not easier because it's practically always hard.

What's next, it's not fair to say that flying to the moon is hard because flying to LEO is tricky? Oh my.


I guess the point they're trying to make is that phrasing it as "Postgres upgrades are hard" is not fair, because it gives the impression it's a weakness in Postgres, when the use case would be hard in any database.

It would be more fair to say "Zero-downtime database upgrade is hard"


I wouldn’t say that MySQL is a better database, but upgrades with it are very easy: you back up, then you upgrade the packages, and the new packages run the upgrade command.


Yea, exactly as on PG.

What about replicated and zero-downtime (the current thread context)? That's also hard with MySQL.


It's actually not too difficult in MySQL, in terms of the mechanical steps required. MySQL's built-in replication has always been logical replication, and they've made major ease-of-use improvements in recent years, for example:

* efficient binary data clone is a single command

* starting replication with GTID positioning is a single command

* data dictionary / metadata upgrades happen automatically when you start a newer-version mysqld
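Roughly, the first two bullets correspond to something like this (hostnames and credentials are placeholders, MySQL 8.0.23+ syntax, clone plugin installed on both sides):

  # 1. Clone the donor's data onto the new-version replica in one statement
  mysql -e "CLONE INSTANCE FROM 'repl_user'@'old-primary':3306 IDENTIFIED BY '<password>';"
  # 2. Point the replica at the primary using GTID auto-positioning and start it
  mysql -e "CHANGE REPLICATION SOURCE TO SOURCE_HOST='old-primary', SOURCE_USER='repl_user', SOURCE_PASSWORD='<password>', SOURCE_AUTO_POSITION=1; START REPLICA;"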

The hard part of major-version upgrades in MySQL is testing your application and workload, to ensure no deprecations/removals affect you, and checking for queries with performance regressions. Tools like Percona's pt-upgrade and ProxySQL's mirroring feature help a lot with this.


Haven't used MySQL in a couple of years, but its replication (all methods) used to have a whole page of gotchas and idiosyncrasies with various corner cases.

They also introduced binary file format changes even with minor and patch-level version number changes, and downgrading stopped being supported. AFAIK in that case you had to restore from backup.

It's just the exact opposite of Postgres' upgrade guarantees.


It's hard to respond without any specifics, but in my personal view it's a pretty solid replication implementation in modern versions. Sure, there are some gotchas, but they're not common.

Realistically, every relational database has corner cases in replication. There are a lot of implementation trade-offs in logical vs physical, async vs sync, single storage engine vs pluggable, etc. Replication is inherently complex. If the corner cases are well-documented, that's a good thing.

I do totally agree the lack of downgrade support in MySQL 8 is problematic.

Postgres is a really great database, don't get me wrong. But no software is perfect, ever. Consider the silent concurrent index corruption bug in pg 14.0-14.3 for example. If something like that ever happened in MySQL, I believe the comments about it here would be much more judgemental!


I thought MySQL could do replication across a major version? If so, that's a big differentiator.



Meh. That's an argument which is just a waste of time. This is a post about Postgres. Thinking about whether or not the title is "fair" towards the software itself is ridiculous and a waste of everyone's time. The title is correct and the software won't feel bad about it. Move on.


Not that I want to defend Oracle, but I've been through the process of installing a new version, starting it up whilst loading the same database files currently being served by an older patch level, and seeing it fail over to the new version without dropping in-flight queries. They've always done pretty well in this space in my view.

Of course, you've still got to obtain a patch. I usually see the licensing in the name of some senior manager who doesn't know how to use a keyboard and who is also the only person allowed to download patches.


You have to do that because Oracle has a nasty habit of changing what’s in the download and then changing the licensing scheme.

I’ve been burned by this before, and gone as far as to require a 3rd party audit for Oracle installations on a quarterly basis.


Are you referring to updating one node at a time of a RAC cluster or is that some other method?


Yes, that is what I'm referring to. But note Oracle has the concept of a single-node cluster, so you can have a completely standalone deployment and still perform this technique by hot-adding a new "node" on the same server.


They're hard if you can't tolerate downtime, which most production users can't.

While that's true for a lot of database systems, some more recently designed ones are much better. [1]

[1]: https://www.cockroachlabs.com/docs/stable/upgrade-cockroach-...


My experience is the opposite. Most production users can tolerate downtime.

It's usually the company leadership that can't tolerate having downtime.

And in fact, they are hit with unplanned downtime of one service or another every month or so because of an outage. They are used to it.

So if you plan for it, explain it, and limit its scope and duration, it usually goes very well unless you are a Fortune 500, a hospital, or something similar.


I think you really underestimate the amount of dependencies (documented and otherwise) that exist in even a medium-sized company.

I once caused a production outage in a retail company that caused all cash registers to stop working. The team that worked on that had pushed in a last-minute change and didn't test whether it handled going offline gracefully.

Right now I'm on call for an identity provider used by people in various timezones, including logistics workers in places that operate 24/7. Even when we do weekend upgrades, we still cause quite a bit of collateral damage. 10 minutes of downtime, multiplied by the number of employees affected, adds up fast.


I’d say a system not designed to support maintenance is not properly architected. You will need maintenance windows for an individual subsystem to perform OS upgrades; DB upgrades are no different, and Postgres upgrades using pg_upgrade are relatively painless, provided you test them first in CI or against production snapshots, something the author’s company seemingly lacks.
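As a concrete illustration of that testing step, pg_upgrade has a dry-run mode you can point at a restored production snapshot (the paths below are just an example layout, not anyone's actual setup):

  # report incompatibilities without modifying any data
  /usr/lib/postgresql/16/bin/pg_upgrade \
    --old-bindir=/usr/lib/postgresql/12/bin \
    --new-bindir=/usr/lib/postgresql/16/bin \
    --old-datadir=/var/lib/postgresql/12/main \
    --new-datadir=/var/lib/postgresql/16/main \
    --check

If --check passes against a snapshot, the real run is much less likely to surprise you.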


"a system not designed to support maintenance is not properly architected"

Indeed, but who cares about system design anymore? How many companies/teams can honestly claim they even had a person with proper DBA competency, while feature after feature was added in sprints doing the minimum required to get each feature shipped as soon as possible (usually one DB schema change with the feature, and then one or more to add indexes due to performance regressions)? DBA competency is only sought when the DB schema has been fubar'd to the extent that frequent outages are the norm, or the version in use has been EOL'd by at least a few months. And by that time the people who "designed" the system are gone, never having documented why a given decision was made.

And that, my friend, is how...

DB upgrades are hard.


Indeed, you can get away with a lot now by just paying (a lot) more money; it feels like design is no longer needed as long as it works. It is how I make some money; people come to me with: 'we run some trivial online shop made by our team and with 100k uniques a month we pay $15k+ for RDS, is that normal?'. So I go in and fix it. Usually it is bad or no design of the DB schemas, which was countered by picking heavy RDS instances. Fun times, considering I expect this to be the case in a very high % of all RDS deployments (all that I have seen so far, so for me it's 100%), not only the ones that asked me for help. When asked, the story is the usual 'people are more expensive than hardware' blah. It usually takes me around $1000 in fees to cut half the costs, so that is blatantly false. Not to mention that the human costs are one-off; if someone pays me 20k to bring 15k/mo down to 1k/mo (which is typical), it is worth it. Unfortunately that's not the break-everything-and-go-bankrupt way of working, I guess! Still, I notice that in the current, harder financial times, I do get more requests.


How happy are you when your global bank has downtime in the middle of the day?


You wouldn't schedule downtime for the middle of the day. You schedule downtime during your period of least usage.


You design the complete system so it does not have dependencies on a single component; that way each subsystem can have proper maintenance schedules. It takes a lot more up-front work, along with defining what levels of service degradation are acceptable during a window, but that's the difference between a professionally run service and one run by amateurs. Look up Google's SRE methodology and their concept of an error budget.


You mentioned hospitals yourself. There are also alarm management solutions for alarm receiving centres, and security and emergency services. Can't really have downtime there either. Of course there are less busy hours, but an alarm can come at any moment and needs to be handled right then.


Blue-green deployment, while it requires a professional, is not particularly complex or unusual.

Postgres is a tool. It can do many things, but the tool cannot run your entire business.


"Postgres is a tool. It can do many things, but the tool cannot run your entire business."

This is something the majority of decision makers pretend not to understand (and sometimes really don't).


zero-downtime-postgres-migration-done-right:

https://engineering.theblueground.com/blog/zero-downtime-pos...
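For context, approaches like the one linked are built on logical replication: stand up a new cluster on the target version, replicate into it, then cut traffic over. A minimal sketch of the core mechanism (names and the connection string are placeholders, and this leaves out the sequence-sync and cutover details that make it genuinely tricky):

  # on the old (source) cluster
  psql -d app -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
  # on the new (target) cluster, after restoring the schema with pg_dump --schema-only
  psql -d app -c "CREATE SUBSCRIPTION upgrade_sub CONNECTION 'host=old-primary dbname=app user=repl' PUBLICATION upgrade_pub;"
  # wait for it to catch up, then switch application traffic to the new cluster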


Would you consider that a piece of cake? I have doubts. The page you linked to describes how, if you follow the instructions on many other blogs, you'll have data loss. Not a symptom of a piece-of-cake process.


If you need zero downtime, you're already in a field where NOTHING is a piece of cake. Not your network, not your computing, not your storage, and I haven't even talked about the human aspect of "zero downtime".


Alright. But the GP comment claimed upgrading PG was a piece of cake. You claim it is not a piece of cake. So it sounds like you agree that the claim that upgrading PG was a piece of cake was misleading.


> the GP comment claimed upgrading PG was a piece of cake. You claim it is not a piece of cake. So it sounds like you agree that the claim that upgrading PG was a piece of cake was misleading.

GP claimed upgrading was a piece of cake, not that zero downtime upgrades are a piece of cake. The two claims aren’t interchangeable. The simple upgrade path is always available, though it may have downtime consequences you personally are unwilling to accept. And the complex upgrade path is complex for reasons that have nothing to do with PostgreSQL - it’s just as complex to do a zero downtime upgrade in any data store, because in all cases it requires logical replication.

So if anything it feels like you’re the one being misleading by acting as though GP made a more specific claim than they actually did, and insisting that the hard case is hard because of PG instead of difficulty that’s inherent to the zero downtime requirement.


So if I tell you that upgrading pretty much all databases is a piece of cake but don't include the caveat "unless you want to keep your data", would you say that is a fair statement?

If you claim that process X is trivial one has to make some assumptions, right? Otherwise I could claim that going to the moon is trivial but leave out "assuming you have a rocket, resources, people and anything else you may require".

Claiming that something is a piece of cake as a broad statement without any details is meaningless at best.


> So if I tell you that upgrading pretty much all databases is a piece of cake but don't include the caveat "unless you want to keep your data", would you say that is a fair statement?

Incredibly bad-faith comparison, this.

Many, many datastore deployments can tolerate 10 minutes of downtime every 4 or 5 years when their PG install finally transitions out of support. Data loss isn’t even in the same universe of problem. It’s reasonable to talk about how easy it is to upgrade if you can tolerate a tiny bit of downtime every few years, since most people can. It’s utterly asinine to compare that to data deletion.


CockroachDB and others come with huge downsides in performance and features. I keep trying to see if these databases are a good fit for me, but just the sheer time it takes to import my database makes me shudder. I've used the AWS cloud-hosted free tier of CockroachDB but ran into transaction timeouts on very moderate loads. There is a reason these databases aren't seeing massive uptake despite their lofty promises.


Yeah CockroachDB suffers from the same problem as PG: the defaults kinda suck for certain workloads. CockroachDB has an additional problem that their documentation isn't as detailed as PG so it can be hard to know what to tune. TiDB and Spanner defaults are much better for a wider range of workloads.


Can you say more about workloads which suck for either of the two databases at their default settings?


Yeah, super easy. Let me just shut down my database server for 30 hours so I can do a dump and load.


Have you looked into `pg_upgrade --link`?


I know it, and it's the way to go after making backups. It's not that hard, but not trivial either. You have to install the old & new postgres versions alongside each other, shut the database down, etc. I've done it many times, it's not hard, but I wouldn't call it a "piece of cake". And it does require some downtime.
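For anyone who hasn't done it, the hard-link path looks roughly like this (Debian-style paths purely as an example; details vary by distro and version):

  # install the new major version alongside the old one, then stop both clusters
  apt install postgresql-16
  systemctl stop postgresql@12-main postgresql@16-main
  # hard-link data files into the new cluster instead of copying them
  /usr/lib/postgresql/16/bin/pg_upgrade \
    --old-bindir=/usr/lib/postgresql/12/bin \
    --new-bindir=/usr/lib/postgresql/16/bin \
    --old-datadir=/var/lib/postgresql/12/main \
    --new-datadir=/var/lib/postgresql/16/main \
    --link
  systemctl start postgresql@16-main

The downtime is the window between the stop and the start, which with --link is usually short, but it is still downtime.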

Anyway, my comment was simply responding to the parent's:

> Or you could just dump and load the database (maybe even pipe it from one to the next).


So PostgreSQL upgrades are a piece of cake if your users don't care whether your system is available to them. Is this some parody of the older claim that MongoDB is a great system if your users don't care whether their data is lost?


Do you actually have such an uptime requirement, or do you think you have such an uptime requirement? How do you handle it with other databases?


Yes, if the software I work on doesn't work for 5 minutes we get a ton of tickets from customers. We have tens of thousands of customers who pay for it. Not being able to shut down your system for an hour isn't exactly a unique requirement. Technically our SLA is 10 minutes to perform jobs, but our customers don't wait that long before creating tickets.

We pay Microsoft to perform upgrades transparently for us. They have multiple servers and they shuffle transaction logs and traffic between them to ensure availability during outages and upgrades. There are 6 copies of each database in different regions. Not sure how that is relevant, though?


It is relevant; reread and understand the parent comment. Get a managed PG and you'll have the same thing.


That's the point.

It's not a weakness in Postgres.

Managing upgrades in a highly available, close to 100% uptime database is hard.

If you want a piece of cake, outsource this service and be happy enjoying the database features working as you please.


Yes, that's what I was saying :)


I believe you are the one who needs to reread the thread. The person I replied to claimed that "PostgreSQL upgrades are a piece of cake."

But it is not. It's complex to do properly, just like with most if not all other databases. Claiming that it is easy is just ignorant, and spreading such misinformation is bad.


I never claimed it's easy. I just said get a managed PG, offload it to someone else, and then yes, it's easy. I'm wondering the same as the other commenter above me: what's the difference with other unmanaged DBs?


Even four 9's give 50 minutes a year to do maintenance work.
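(For reference, 99.99% availability allows 0.0001 × 365 × 24 × 60 ≈ 52.6 minutes of downtime per year, so 50 is a fair round figure.)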

I am a fan of planning for maintenance. Planned downtime is definitely accepted by most of the population. I mean, what are you gonna do?

Much better a planned downtime than an outage. Leaving a system to rot just because the architecture is "don't touch it while it's working" is a sure recipe for a disaster of some kind.


> Even four 9's give 50 minutes a year to do maintenance work.

Just the security patching of the host OS will likely consume a bunch of those minutes.

Not sure what point you were trying to make apart from that. I am not advocating that people should leave systems to rot.



