The relationships were enforced in-code, so the safety existed there instead of by the database. Even where I work now with sharding and such, we don’t have any foreign key constraints yet serve billions of requests. Code can provide just as much safety as the database, you just have to be smart enough to not try and override it.
> Code can provide just as much safety as the database
No, database constraints declaratively specify invariants that have to be valid for the entirety of your data, no matter when it was entered.
Checks and constraints in code only affect data that is currently being modified while that version of the code is live. Unless you run regular "data cleanup" routines on all your data, your code will not be able to fix existing data or even tell you that it needs to be fixed. If you deployed a bug to production that caused data corruption, you deploying a fix for it will not undo the effects of the bug on already created data.
Now, a new database constraint will also not fix previously invalid data, but it will alert you to the problem and allow you to fix the data, after which it will then be able to confirm to you that the data is in correct shape and will be continue to be so in the future, even if you accidentally deploy a bug wherein the code forgets to set a reference properly.
I'm fine with people claiming that, for their specific use case or situation, database constraints are not worth it (e.g. for scalability reasons), but it seems disingenuous to me to claim that code can provide "just as much safety".
> Code can provide just as much safety as the database, you just have to be smart enough to not try and override it.
This statement is absolutely false. Some application code that random developers write will never be a match for constraints in a mature, well tested database.
> Some application code that random developers write will never be a match for constraints in a mature, well tested database.
Well, lucky for you, most places don't hire random developers, only qualified ones. /s
Where I work has billions of database tables sharded all over the place. There simply isn't a way to define a constraint except via code. We seem to get along just fine with code reviews and documentation.
I’ve never seen a database with just one writer. I wouldn’t even bet on all the writers using the same language, much less reusing a single piece of code implementing constraints that never have changed and never will.
Can you tell me what the performance impact is with code logic and extra dB calls versus built-in constraint logic? Because I am fairly convinced that most modern DBs have almost zero impact from relationship constraints. Those are heavily optimised. I am curious to your study and want to know more.
The problem with sharding is that you can't join, as most databases don't support "distributed joins" so relationship constraints don't even make sense. Thus it has to be done in code and often requires multiple queries to various servers to get the full picture. Most of this can be optimized by denormalizing as much as possible for common queries and leaving normalized records for less common queries.
There's a performance impact, for sure, but there's really not much that can be done without switching to some database technology that does sharding for you and can handle distributed joins (rethinkdb comes to mind, but I'm not sure if that is maintained any more). But then you have to rewrite all your queries and logic to handle this new technology and that probably isn't worth it. Even if you were to use something like cockroachdb and postgres, cockroachdb doesn't support some features of postgres (like some triggers and other things), so you'll probably still have to rewrite some things.
> I am fairly convinced that most modern DBs have almost zero impact from relationship constraints
It cannot fundamentally be zero impact, the database needs to guarantee that the referred record actually exists at the time the transaction commits. Postgres takes a lock on all referenced foreign keys when inserting/updating rows, this is definitely not completely free.
Found this out when a plain insert statement caused a transaction deadlock under production load, fun times.