
> 5. Never remove a name.
>
> Removing a named schema component at any level is a breaking change for programs that depend on that name. Never remove a name.

I agree with this in theory and have seen it go oh so very wrong in practice. Tables with dozens of columns, some of which may be unused, invalid, actively deceiving, or at the very least confusing. Then a new developer joins and goes "A-ha! This is the way to get my data." ... except it's not, and now their query is lying to users, analysts, leadership, anyone who thinks they're looking at the right data but isn't.

You absolutely have to make time to deprecate and remove parts of the schema that are no longer valid. Even if it means breaking a few eggs (hopefully during a thorough test run or phased rollout).
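One low-risk shape for that phased rollout is to rename the column first, so anything still depending on it fails loudly in tests rather than silently reading stale data, and only drop it after a quiet period. A minimal sketch with hypothetical table and column names, using SQLite for illustration (RENAME COLUMN needs SQLite >= 3.25, DROP COLUMN >= 3.35; other engines have their own ALTER TABLE caveats):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, legacy_score REAL)"
)

# Phase 1: rename the suspect column. Queries that still reference
# legacy_score now error out instead of returning misleading data.
conn.execute(
    "ALTER TABLE users RENAME COLUMN legacy_score TO deprecated_legacy_score"
)

# Phase 2: after a monitoring period with no breakage, drop it for real.
conn.execute("ALTER TABLE users DROP COLUMN deprecated_legacy_score")

cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'email']
```

The rename phase is what makes the rollback cheap: if something breaks, you rename the column back rather than restoring data.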




This x100. The most miserable and frustrating periods of my career have been in places that never deprecated anything. You could spend hours doing something that looked quite sensible, get a draft that seemed to work, and then be told "oh yeah, that's deprecated, that data isn't even populated anymore, those rows just _happened_ to have data in dev." Then you either start sanity checking everything before doing anything and your velocity sucks, or just keep stepping on landmines and losing whole afternoons.

Edited to add: docs can help, but only so much. Environments that cluttered also tend to have layers of docs that are equally misleading.


There are few things more important than comprehensive and up to date database documentation. Otherwise you don't even know what your data means. An organization that cannot produce documentation like that is somewhere between amateurish and waiting for a disaster to happen, unfortunately.


I don’t really know how to screen for that before joining a company but I’d say 20% of companies seem to be at that point.


Reclaiming the physical storage of an unused column is often a costly and sometimes impossible operation, which is why many legacy applications end up with the equivalent of my_column_final_final_v2. Database administration requires compromises like this sometimes in the name of uptime and data integrity. Big migrations are always inherently a little risky, and from the view of many DBAs, why even risk it just for a bit of clean up? Your schema shouldn't be totally transparent to your application's business logic anyway, so there are better places to enforce naming hygiene.


I believe in most relational databases you can just alter a column to allow null values and run a series of transactions in the background to set that column value to null, and that will quite effectively free up most of the physical overhead of the column in question. I would be reluctant to delete, rename, or even clear all the data out of a column without providing an alias though.
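A minimal sketch of that background-backfill approach, using SQLite and hypothetical table/column names; on a real production engine you would also watch lock contention and replication lag between batches:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT, old_blob TEXT)"
)
conn.executemany(
    "INSERT INTO events (payload, old_blob) VALUES (?, ?)",
    [(f"p{i}", "x" * 100) for i in range(1000)],
)
conn.commit()

BATCH = 100  # small transactions keep each lock window short on a live system
while True:
    cur = conn.execute(
        "UPDATE events SET old_blob = NULL WHERE id IN "
        "(SELECT id FROM events WHERE old_blob IS NOT NULL LIMIT ?)",
        (BATCH,),
    )
    conn.commit()
    if cur.rowcount == 0:  # nothing left to clear
        break

remaining = conn.execute(
    "SELECT COUNT(*) FROM events WHERE old_blob IS NOT NULL"
).fetchone()[0]
print(remaining)  # 0
```

The column itself stays in place, so nothing that merely names it breaks, but the bulk of its storage is released as pages are rewritten.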


Yeah, this is how you grow to the point of destruction. Your schema is half noise and nobody understands it. Then someone says you need to start from scratch.


Worse, I've seen supposedly unused columns be repurposed for something else, and then existing analytics fall apart.


Hyrum's law in action.


But then why not address the real problem? If a table has a few columns which are unused or invalid or deceiving, why did we let developers introduce them? Lack of planning? Lack of peer review? Lack of talent?

I understand these “ten rules” as: as long as you have a decent codebase and decent engineers, these ten rules will make your life easier.

These rules are nothing if you are dealing with crap codebases (they can help, sure, but they will just be patches).


Because sometimes you make assumptions that are seemingly correct but eventually found to be wrong or based on flawed inputs from sources beyond your control.

Any system that ultimately relies on "engineers need to always do the right thing" is a flawed, brittle, ineffectual system. Because even the best engineers will make a mistake somewhere, and because you can't exclusively hire "the best" engineers.

Let's spend our time figuring out how to recover from mistakes rather than trying to pretend they'll never happen.


I've worked with some databases that are 20+ years old and have outlived multiple application iterations. There's always going to be cruft in this kind of situation; it just comes with the territory of supporting applications with real production users for a long time.


Even the best team makes design decisions that turn out to be suboptimal when the requirements change.

Also, even the best team will sometimes make mistakes.

DB schemas are unforgiving.


Requirements change over time. Domain understanding changes over time. Businesses change over time. Environments change over time. Unless you are a seer with perfect precognition, most of what you have done will be invalidated over time.

Hence: make your code and data easy to change, and keep them simple, as you cannot predict in what way they will change.


> Unless you are a seer with perfect precognition

Even then ain't nobody in a 10 person seed-stage startup got time, resources, or need to build the database you'll want to have when you're a 600 person Series C monster.


Developers without special training should generally not do database design for the sort of databases that are intended to last decades. It is a similar task to developing a complex file format that is usable twenty years later - not something to be done off the cuff, and if you want schema stability database design requires more care than most file formats.


While I agree with you, unfortunately this is unrealistic. Unless a startup happens to have someone skilled with schema design, they’re going to make do with what they can, and it’s very unlikely that they’d waste headcount on a dedicated DBA / DBRE at a young stage.

The immediate effect of that, of course, is that they also won’t try to hire any such person until the DB is a problem they can’t scale via throwing money at it.



