Great writeup (including the human cost, e.g. loss / lack of sleep, which in my ...

chipbarm · on July 29, 2023

I finally made an account just to respond to this, I hope you don't find that too aggressive a move.

Null is a perfectly valid value for data, and should be treated as such. A default value (e.g. -1 for a Boolean or an empty for a string) can make your system appear to work where NULL would introduce a runtime error, but that doesn't mean your system is performing as expected, it just makes it quieter.

I know it's tempting to brush NULL under the rug, but nothing is just as valid a state for data as something, and systems should be written generally to accommodate this.

colejohnson66 · on July 29, 2023

I agree with you re: NULL being a useful thing. I personally use nullable floats in an internal company program to denote unknown values. However, the "billion-dollar mistake" everyone brings up with it has to do with NULL allowance being implicit. In languages like C/C++, Java, C#[a] (and more), any pointer could be NULL and the only way to know is to do a NULL check. In SQL (which we're talking about here), one must explicitly call out `NOT NULL` in the column's definition.[b] Rust (and other FP languages) gets a point here by having "optional" types one must use to have a NULL-like system.

[a]: C# is fixing this with "nullable reference types", but as long as it's still opt-in, it's not perfect (backwards compatibility and everything). I can still forcibly pass a NULL to a function (defined to not take a null value) with the null-forgiving operator: `null!`. This means library code still needs `ArgumentNullException.ThrowIfNull(arg)` guards everywhere, just in case the caller is stupid. One could argue this is the caller shooting themselves in the foot like `Option.unwrap_unchecked` in Rust, but "good practice" in C# (depending on who you ask) tends to dictate guard checks.

[b]: Which is kind of stupid, IMO. Why should `my_column BOOL` be able to be null in the first place? Nullable pointers I can understand, but implicitly nullable everything is a horrible idea.

chipbarm · on July 29, 2023

In SQL as you've said, nullability is explicit. It's arguably the wrong way around (i.e. NOT NULL rather than NULLABLE), but it is explicit. I feel the issue comes from the intersection of languages without explicit nullability and their data storage techs; removing that explicit typing from SQL doesn't fix the issue.

(I feel you agree with this btw, just being explicit)

kaoD · on July 29, 2023

What's the alternative, an empty string?

IMO the problem (at least in this case) is not NULL in the DB, but NULL at the application level.

If NULL is some sort of Maybe monad and you're forced to deal with it, well, you're forced to deal with it, think about it, etc.

Empty string, whatever NULL string is in your language of choice, or some sort of sigil value you invent... not much of a difference.

int_19h · on July 29, 2023

The problem is precisely that NULL is not some sort of Maybe monad, but people keep trying to use it as such. It's a lot like using NaN as a sentinel value for floats - sure, you can do that, but when something goes wrong, instead of an error at the point where the problem is, you end up dealing with a mysterious NULL somewhere way down the line. And that's the best case - the worst is that you get wrong query results because of the way NULL comparisons work.

An empty string is better as a sentinel value because at least this doesn't have the weird "unknown value" semantics that NULL does. But if you really want the same level of explicitness and safety as an option type, the theoretically proper way to do this in relational model is to put the strings themselves in a separate table in a 1:N (where N is 0 or 1) relationship with the primary table.

resonious · on July 29, 2023

It looks to me like using empty string would not have prevented the bug in the article. If their language had maybes, they might be able to prevent this bug by having a function type signature where uri is concrete. And most langs with maybes will automatically turn external nulls into maybes.

vbezhenar · on July 29, 2023

With empty string there's gotcha: Oracle treats empty string and null as the same values. If you really need sentinel value, generate UUID.

mst · on July 29, 2023

I believe that's a nasty implementation detail of Oracle and not a problem for anybody using anything else.

A generated UUID, if possible, is probably much more clear because it will only be there if it was inserted very deliberately.

But at that point honestly I'd usually prefer breaking the field out into its own table.

semiquaver · on July 29, 2023

Joins are cheap. Wide tables are often a sign that a data-model is a bit too CRUDdy. Foreign key relationships often do a much better job modeling optionality/cardinality in relational systems.

In this case, a `user_uris` table with non-nullable columns and a unique constraint on `user_id` is the first option that comes to mind.

ngc248 · on July 30, 2023

At scale ... it is better to use CQRS ... sou you have a transaction model which is fully normalized and a read nly model which is wide, if you really want to use a RDBMS.

kaoD · on July 29, 2023

For a table with a single column (plus FK), isn't LEFT JOIN isomorphic to just having a NULLABLE column (and much better at data locality)?

It might prevent error if you only rely on INNER JOIN but that's rarely the case at least for me (you often want to access the record anyways).

Much safer to deal with the NULL at the application level.

traxys · on July 29, 2023

I had a situation where I'm not really sure I could have used something else than null: I need a value in one of two columns exactly (meaning one is NULL and the other not).

You can build a constraint to check that if it's in the same table, but across tables it seems to be a bit more complex right ?

snmx999 · on July 29, 2023

You could have the first column indicate the type of value and the second column the value. If you now have your columns named "a" and "b" you could have the first column named "type" only allowing values "a" or "b" and the second column would be "value" where the actual value is stored.

tedunangst · on July 29, 2023

In this specific case, all the local users could have had URLs in the database instead of NULL (or empty string), which would have prevented them from merging.

mananaysiempre · on July 29, 2023

> What's the alternative, an empty string?

Yes! NULL is relational for “don’t know”, and SQL is (mostly, with varying degrees of success) designed to treat is as such. That’s why NULL=anything is NULL and not e.g. false (and IMO it’s a bit of a misfeature that queries that branch on a NULL don’t crash, although it’s still better than the IEEE 754 NaN=anything outright evaluating to false). If the value is funky but you do know it, then store a funky value, not NULL.

TylerE · on July 30, 2023

But a known value of empty string is a fairly commonplace thing, and I have never wanted to treat that as null.

mananaysiempre · on July 30, 2023

I’m not sure what to say here. ... Yes? If you’re referring to the Oracle behaviour where '' IS NULL, well, the rude way of putting it is that Oracle is doing a stupid. The more polite way of putting it is that Oracle is absolutely ancient and these parts probably existed long before people had the theory developed well enough to recognize which things make sense and which don’t, and now Oracle’s backward-compat-driven livelihood depends on not recognizing that it’s making no sense there. Either way, if this matters to you, you’re stuck and will have to work around this particular wart.

Tempest1981 · on July 29, 2023

A separate Boolean column?

kaoD · on July 29, 2023

Can't wait to have a NULL url and has_url TRUE.

Might or might not be based in production data I deal with on a daily basis.

btown · on July 29, 2023

> the account merge code could have been narrowly scoped

IMO automated merging/deduplication of "similar" records is one of those incredibly hard problems, with edge cases and race conditions galore, that should have a human in the loop whenever possible, and should pass data (especially data consumed asynchronously) as explicitly as possible, with numerous checks to ensure that facts haven't shifted on the ground.

In many cases, it requires the implementors to start by thinking about all the concerns and interactivity requirements that e.g. a Git-style merge conflict would have, and try to make simplifying assumptions based on the problem domain from that starting position.

Looking at the Mastodon source [0], and seeing that there's not even an explicit list of to-merge-from IDs passed from the initiator of the merge request to the asynchronous executor of the merge logic, it seems like it was only a matter of time before something like this happened.

This is not a criticism of Mastodon, by the way! I've personally written, and been bitten by, merge logic with far worse race conditions, and it's frankly incredible that a feature like this even exists for what is effectively [1] a volunteer project! But it is a cautionary tale nonetheless.

[0] https://github.com/mastodon/mastodon/blob/main/app/workers/a... (note: AGPL)

[1] https://opencollective.com/mastodon

wouldbecouldbe · on July 29, 2023

It’s not hard not to merge on null.

I agree rest of it can be hard, and would be nervous. But this should have been obvious

msla · on July 29, 2023

NULL is inevitable if you use JOINs, simply as a matter of what a JOIN is.

More deeply, NULL is inevitable because reality is messy and your database can't decline to deal with it just because it's messy. You want to model titles, with prenomials and postnomials, and then generate full salutations using that data? Well, some people don't have postnomials, at the very least, so even if you never store NULLs you're going to get them as a result of the JOIN you use to make the salutation.

You can remove the specific NULL value, but you can't remove the fact "Not Applicable"/"Unknown" is very often a valid "value" for things in reality, and a database has to deal with that.

wouldbecouldbe · on July 29, 2023

Even if there is null, the merge function should have done some sort of null / istruthy check. That’s unbelievable.