There's often a semantic difference between null and empty.
Like, if I'm checking the result of a batch processing job, I want to know whether the job finished and produced an empty set, or whether there is simply no result yet.
It's "the thing you're looking for doesn't exist" vs. "the thing you're looking for exists, but is empty."
That's fair to say I guess - but it should be accompanied by some other property to indicate a condition like that.
The API I'm thinking of though was effectively a wrapper around a database query which retrieved items.
That's what gets me about that decision you see. When you get nothing from a database, you get an empty set. The runtime was some version of .NET Framework, which by default would write an empty set as an empty array. So it wasn't even an accident - someone actually had to add extra logic to make it return null.
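To make that concrete, here's a minimal sketch in Python rather than .NET. The `items` table and the `fetch_items` helper are made up for illustration. A query that matches nothing comes back as an empty list, and that serializes to `[]` with no extra logic:

```python
import json
import sqlite3

def fetch_items(conn, owner):
    """Return matching item names; no matches yields [] rather than None."""
    rows = conn.execute(
        "SELECT name FROM items WHERE owner = ?", (owner,)
    ).fetchall()  # fetchall() returns an empty list when nothing matches
    return [name for (name,) in rows]

# Hypothetical schema and data, just enough to run the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, owner TEXT)")
conn.execute("INSERT INTO items VALUES ('widget', 'alice')")

body = json.dumps({"items": fetch_items(conn, "bob")})
print(body)  # {"items": []}
```

Getting `null` out of this would take a deliberate extra branch, which is the point.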
Perl's DBI module gets around this with a "zero but true" value, "0E0".
It is useful in cases such as indicating that an operation was successful but zero rows were affected. Used as a number it evaluates to zero; used as a boolean it evaluates to true.
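For anyone who hasn't seen the trick, a rough Python analogue looks like the sketch below. This is not how DBI implements it. `ZeroButTrue` and `execute_update` are hypothetical names, and the idea is only to show a value that is numerically zero yet truthy:

```python
class ZeroButTrue(int):
    """Hypothetical analogue of DBI's "0E0": equal to 0, but truthy."""

    def __new__(cls):
        return super().__new__(cls, 0)

    def __bool__(self):
        return True

def execute_update(rows_affected):
    """Pretend driver call: None on failure, otherwise a truthy row count."""
    if rows_affected is None:
        return None
    return rows_affected if rows_affected else ZeroButTrue()

result = execute_update(0)
if result:  # true even though zero rows were affected
    print("succeeded, rows affected:", result + 0)
```

A plain `0` here would make `if result:` treat a successful no-op as a failure.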
I think in the general case for an API, though, a status flag is the best way to avoid confusion, as you say. Null could just as easily mean 'error' as 'not yet finished'.
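A minimal sketch of what I mean by a status flag, in Python. The `JobStatus` and `JobResponse` names and fields are assumptions for illustration, not any particular API:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class JobStatus(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"

@dataclass
class JobResponse:
    """Hypothetical response shape: the status carries the meaning, not null."""
    status: JobStatus
    items: List[str] = field(default_factory=list)  # always a list, never None
    error: Optional[str] = None

def describe(resp: JobResponse) -> str:
    if resp.status is JobStatus.PENDING:
        return "no result yet"
    if resp.status is JobStatus.FAILED:
        return f"error: {resp.error}"
    return f"finished with {len(resp.items)} item(s)"

print(describe(JobResponse(JobStatus.PENDING)))
print(describe(JobResponse(JobStatus.SUCCEEDED)))
print(describe(JobResponse(JobStatus.FAILED, error="timeout")))
```

With this shape, an empty `items` list only ever means "finished, nothing found", and the client never has to guess what a null was supposed to say.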
I've seen this happen due to system evolution. At first some entity may have a parent record or not. So when you ask for the parent, you either get it, or null.
But then the system evolves to not be many-to-one, but many-to-many. To avoid breaking old clients, they make it so the only difference is when they return multiple related records, in which case they're given in an array.
Thus you now have: null for empty, the record itself if there is only one related record, and an array of records if there is more than one.
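On the client side, the usual defence is to normalize all three shapes into a list before doing anything else. A minimal Python sketch, assuming the response has already been decoded from JSON (`normalize_related` is a hypothetical helper name):

```python
def normalize_related(value):
    """Collapse the null / single record / array shapes into one list."""
    if value is None:
        return []          # null meant "no related records"
    if isinstance(value, list):
        return value       # already the many-to-many shape
    return [value]         # a lone record from the old many-to-one shape

print(normalize_related(None))
print(normalize_related({"id": 1}))
print(normalize_related([{"id": 1}, {"id": 2}]))
```

It works, but every client has to know to do it, which is why returning an array from the start would have been kinder.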
There are still evolutionary justifications which I'm sympathetic to now and then, but mostly not. In many (most?) cases, whether something should be one-to-one or one-to-many is known up front. Or should be known. We have decades of examples of best practices with many common data structures. Hard-coding a customer account to only ever have one address, for example - no. I don't buy that justification - a customer/address relationship, for any size of company or project, should just be modeled as one-to-many (at least). It may be slightly more 'work' up front, but that work avoids potentially major breaking changes and rework later on.
I get countered with "YAGNI" now and then, but after 25+ years of doing this (and, again, with decades of examples of your exact use cases already on Google, ready to learn from), I can usually tell when you ARE going to need it.
My example was pretty contrived. I would also expect some sort of flag representing the state of the job in that case.
My point was just that the absence of a value is distinct from an empty value, and that intentionally modeling those two cases separately can remove a lot of ambiguity.
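In Python terms, the distinction is roughly `None` versus `[]`, and it is easy to lose by accident with a truthiness check. A small sketch (the `summarize` function is made up for illustration):

```python
from typing import List, Optional

def summarize(result: Optional[List[str]]) -> str:
    """Treat 'no value' and 'empty value' as separate cases."""
    if result is None:   # absence: there is nothing to look at
        return "no value"
    if not result:       # presence: the value exists but holds no items
        return "empty"
    return f"{len(result)} value(s)"

print(summarize(None))
print(summarize([]))
print(summarize(["a"]))

# A bare truthiness test cannot tell the first two apart:
print(bool(None) == bool([]))
```

Checking `is None` first is what keeps the two cases separate. A lone `if not result:` would merge them.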
Ehm, you shouldn't GET the result; you should GET the job descriptor. If you really need to GET the result, then you should expect an HTTP 4xx or 3xx response.
Of course there is, but generally speaking there's an infinite number of possible reasons for an empty or null value, not precisely two. That's why if you don't watch out they multiply.