>“But postgres isn’t a document store!” I hear you cry. Well, no, it isn’t, but it does have a JSONB column type, with support for indexes on fields within the JSON blob.
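For the uninitiated, something like this is all it takes (table and field names made up for illustration):

```sql
-- Hypothetical sketch: a content table keeping the document in a JSONB column.
CREATE TABLE content (
    id  bigserial PRIMARY KEY,
    doc jsonb NOT NULL
);

-- A GIN index supports containment queries (@>) across the whole document...
CREATE INDEX content_doc_gin ON content USING GIN (doc);

-- ...while a B-tree expression index targets one specific field.
CREATE INDEX content_doc_type ON content ((doc ->> 'type'));

-- Each of these can then hit its index instead of a sequential scan:
SELECT id FROM content WHERE doc @> '{"type": "article"}';
SELECT id FROM content WHERE doc ->> 'type' = 'article';
```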
> approximately 2.3m content items.
I had a previous project where we did a similar thing (except with HSTORE instead of JSONB) and it exploded rather dramatically (very simple queries took multiple minutes or timed out entirely) after around 30m rows. I hope the Grauniad doesn't run into a similar issue, or at least anticipates it better than we did.
2.3m content items is tiny. So is 30m. You're at least one or two orders of magnitude away from anything that will start to bother Postgres. Anything before that is more likely an index or IOPS issue.
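If you want to tell which of the two it is, EXPLAIN is your friend. A minimal sketch, assuming the hypothetical content table from upthread:

```sql
-- EXPLAIN (ANALYZE, BUFFERS) shows whether a query uses an index or falls
-- back to a sequential scan, and how many buffers it had to read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM content WHERE doc ->> 'type' = 'article';
-- A "Seq Scan on content" node over tens of millions of rows is the classic
-- sign of a missing expression index, not of Postgres hitting a wall.
```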
On that note, one thing the article points out is that managing the MongoDB setup was a full-time job... although managing Postgres will probably be one as well. It would be easy and understandable to try to make it static and needing no attention, but that's a recipe for bad times.
Having issues accessing X million rows seems like something that would be caught by someone focused full-time on the database's performance.
RDS is really good at not needing to be managed, tbh, at the Guardian's scale. And when shit does happen, they can throw money at AWS's premium support. I definitely think their move makes a ton of sense there.
Aye, even so. JSONB documents are definitely heavier to store, and their indexes can get quite big, so disk space / IOPS can become an issue sooner. But still, 2.3M is a drop in the ocean. (I don't know about HSTORE though…)
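If you're curious how much of that footprint is index overhead, Postgres will tell you directly (again assuming the hypothetical content table sketched upthread):

```sql
-- pg_total_relation_size includes indexes and TOAST; comparing it against
-- the heap alone shows how much of the disk footprint is index overhead.
SELECT
    pg_size_pretty(pg_relation_size('content'))       AS heap_size,
    pg_size_pretty(pg_indexes_size('content'))        AS index_size,
    pg_size_pretty(pg_total_relation_size('content')) AS total_size;
```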
At my previous startup I was ingesting Hearthstone games at a rate of 1-2M / day. Before being handed off to permanent storage (S3, Redshift, etc.), a bunch of the data would get stored in a JSONB column with 14-day retention. This all ran on a 200GB t2.large RDS instance, which was our smallest instance and never really caused an issue.
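The retention part is simple to implement; schematically it looked something like this (the games table and ingested_at column here are stand-ins, not our real schema):

```sql
-- An index on the timestamp keeps the periodic purge cheap.
CREATE INDEX IF NOT EXISTS games_ingested_at ON games (ingested_at);

-- Run on a schedule: drop everything older than the retention window.
DELETE FROM games
WHERE ingested_at < now() - interval '14 days';
-- At millions of rows/day, time-based partitioning with a DROP TABLE per
-- expired partition is usually cheaper than bulk DELETEs, but the idea is
-- the same.
```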