Hacker News new | past | comments | ask | show | jobs | submit login

A quick question I use to demonstrate the usefulness of NULL is "What color is the elephant on my desk?"

That question doesn't have an answer because there is no elephant on my desk. It can't be represented by any color, the answer needs to indicate that there is no value.




The result may be null when asking the question, but the tables representing this reality do not need a null column.

(Specifically, not returning any rows when queried, which is different from null)


Often in analytics you will create datasets which are an amalgamation of several upstream sources. For example, an Endangered Species database. You might have "wing length" and "number of eyes" as fields, and both of these need to be NULL for something like a sea urchin.

There's really no justification to avoid nulls as a principle.


Okay, but what if I `SELECT wood, lampshade, elephant_color FROM desks WHERE location = "office"`? How does it represent my mahogany office desk with a beach theme lampshade and no elephant on it, vs my teak bedroom writing desk with a pink elephant and white cream lampshade?


By normalizing into multiple tables.

It is hard to model without requirements but you probably need a table for type of desks, another for desk instances with location and another one to many linking table for items/colors.

Why? I like two lamps and no elephants. By normalizing, this new requirement is simple to represent compared to jamming more fields into the desk table.

Or if another person prefers penguins you don’t end up with penguin_color columns.

It is possible the one table design works, but the introduction of nulls is a smell that indicates maybe the design is a problem.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: