> The primary key/clustered index is how the data is physically stored and sorted, in a btree.
This is only true for RDBMS with a clustered index, like MySQL (assuming InnoDB) and MSSQL. Postgres stores tuples in a heap. The indices are generally B+trees, yes, but there is an extra level of indirection via the visibility map.
There are a million gotchas, is the problem. You declared a partial index on foo as `WHERE foo = true`, but you’re querying for `WHERE foo IS NOT false`? I have bad news about the secret third state, `WHERE foo IS NULL` (not to mention the subtle differences between equality and `IS`).
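This three-valued logic is standard SQL, so it can be illustrated with any engine; here is a minimal sketch using Python's bundled sqlite3 (the table and column names are made up, and SQLite's 1/0 stand in for Postgres's true/false):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, foo BOOLEAN)")
# One true row, one false row, one NULL row -- the "secret third state".
con.executemany("INSERT INTO t (foo) VALUES (?)", [(1,), (0,), (None,)])

# `foo = true` matches only the true row; NULL compares as UNKNOWN, not true.
eq_true = con.execute("SELECT count(*) FROM t WHERE foo = 1").fetchone()[0]

# `foo IS NOT false` also matches the NULL row -- a strictly wider predicate,
# so a partial index built on `foo = true` cannot serve this query.
is_not_false = con.execute("SELECT count(*) FROM t WHERE foo IS NOT 0").fetchone()[0]

print(eq_true, is_not_false)  # -> 1 2
```

The two predicates match different row sets, which is exactly why the planner can't answer the `IS NOT false` query from a partial index defined with `WHERE foo = true`.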
You made a UUIDv4 as the PK, thinking that since Postgres doesn’t cluster on it, it won’t suffer the same performance issues? Visibility map just wrecked your IOPS, enjoy the 7x hit to pages.
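A toy model of why random keys hurt (this is not Postgres internals; the page fan-out and counts are invented for illustration): with a sequential key, recent inserts all land on a few adjacent leaf pages, while UUIDv4 keys scatter them across the whole tree.

```python
import uuid

ENTRIES_PER_PAGE = 200  # hypothetical leaf-page fan-out
N = 100_000

def pages_touched(keys):
    # Model the leaf level of a B+tree as the sorted key space cut into
    # fixed-size pages, then count how many distinct pages the most
    # recent 1,000 inserts landed on.
    order = {k: i for i, k in enumerate(sorted(keys))}
    recent = keys[-1000:]
    return len({order[k] // ENTRIES_PER_PAGE for k in recent})

sequential = list(range(N))
random_uuids = [uuid.uuid4().hex for _ in range(N)]

print(pages_touched(sequential))    # a handful of adjacent pages
print(pages_touched(random_uuids))  # scattered across hundreds of pages
```

Every page an insert dirties is a page that must be read, written back, and WAL-logged, which is where the IOPS amplification comes from.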
You have an index and it’s not being used? Could be an incorrect equality check (see the first example), inadequate auto-analyze on a write-heavy table leading to stale statistics, or any number of other things.
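One concrete instance of those "other things", sketched with sqlite3 for portability (table and column names are made up; the same principle applies in Postgres unless you have a matching expression index): wrapping the indexed column in a function hides it from the planner.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
con.execute("CREATE INDEX users_email ON users(email)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the plan text in the 4th column.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

# Bare column in the predicate: the planner can seek the index.
indexed = plan("SELECT name FROM users WHERE email = 'a@b.c'")
print(indexed)       # a SEARCH using users_email

# Function over the column: the index no longer applies, full scan.
unindexed = plan("SELECT name FROM users WHERE lower(email) = 'a@b.c'")
print(unindexed)     # a SCAN of the table
```

`EXPLAIN` (or `EXPLAIN QUERY PLAN` here) is the first tool to reach for whenever an index you expect to fire doesn't.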
RDBMS are much easier to understand if you know what a B+tree looks like, yes, and I highly recommend any backend dev take the time to do so. But there is so, so much more that can go wrong.
> RDBMS are much easier to understand if you know what a B+tree looks like, yes, and I highly recommend any backend dev take the time to do so. But there is so, so much more that can go wrong.
I agree, but that stuff that goes wrong isn't (in my experience) super common. At least, not until your tables reach fairly large (as in 10s of GBs of data) sizes. Which is why I'd suggest you still employ DBAs :).
You wouldn't, for example, think about pulling out a filtered (partial) index if you weren't dealing with fairly large tables, where a full index might have an overly large negative impact.
On the flip side, I've seen more than a few tables where a fundamental understanding of how the data is structured would have prevented a multitude of issues like bad or missing indexes. (My favorite, which is depressingly common: `CREATE INDEX blah ON foo(id, thingToIndex)`)
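A quick way to see why that column order is a problem, sketched with sqlite3 (names hypothetical, and `payload` added so the index isn't accidentally covering): with the PK as the leading column, the index can't be seeked for a filter on `thingToIndex` alone.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, thingToIndex TEXT, payload TEXT)")
con.execute("CREATE INDEX blah ON foo(id, thingToIndex)")  # PK first: wrong order

def plan(sql):
    # Join the planner's detail strings into a one-line summary.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT payload FROM foo WHERE thingToIndex = 'x'"

bad = plan(q)   # full table scan: without an id predicate, `blah` can't be seeked
con.execute("CREATE INDEX better ON foo(thingToIndex)")
good = plan(q)  # now an index search on `better`

print(bad)
print(good)
```

A composite index is sorted by its leading column first, so `(id, thingToIndex)` only helps queries that constrain `id`; for this filter it's pure write overhead.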
> I agree, but that stuff that goes wrong isn't (in my experience) super common. At least, not until your tables reach fairly large (as in 10s of GBs of data) sizes. Which is why I'd suggest you still employ DBAs :).
As a DBRE who oversees tables in the 100s of GB, some in TB range, these things happen entirely too often. But then, it’s my job. And yes, I whole-heartedly agree that past a certain scale, you need DB folks.