For me one of the most practical benefits of singular table names is: not having to think about the primary/foreign key names, because you can always just assume that it's "[table name]_id".
If you have a "users" table, you either have to name the column "users_id", which is incorrect (because per row it's the id of a single user), or you have to use mental cycles to de-pluralize the name (or even worse, try to remember whether you called it "users_id" or "user_id" on a case by case basis).
Maybe I’m weird or thinking about this wrong, but it makes sense to me to pluralize the table “users”, because it is a collection of users, and when I’m naming the key it is the “user ID” because it is the ID that corresponds to a single user. I don’t have to do any mental gymnastics for this to make sense, it just seems natural to me. But again, maybe I’m thinking about this incorrectly and just don’t realize it.
I believe GP is talking about how you refer to this column in another table, where "id" is not an option.
In the "widgets" table, you need to refer to either "the id for the user" or "the id to use in the users table" so both user_id and users_id are perfectly reasonable choices if the other table is called "users".
I honestly did not think about foreign key column names when I was typing, so thanks for pointing it out.
I'm always a fan of making foreign key column names more descriptive than just the table name. Widgets may have a user_id but it probably makes more sense to have an owner_id.
It’s still a good practice to refer explicitly to the foreign reference in the column name, so in your example owner_user_id would be preferable to owner_id if the foreign key refers to a record in the user table.
I disagree. Table_id is more descriptive and easier to use when joining -- no need to alias if you're selecting multiple table keys, can use USING for joins, etc
Large queries, many tables, many FKs will make it easier to see why these conventions exist. Patterns like "users JOIN things USING (user_id) JOIN stuff USING (user_id, thing_id) ...".
Then you know exactly what that column is there for. It was intended to be a foreign key for the users table.
I think "orders.user_id" is fine too of course, I'm not going to argue against that. But isn't it just the tiniest bit more ambiguous?
Edit: There's probably also always better names than just the table name with id, too.
For example orders might have more than one fk to the users table, depending on the type of site. Something like ebay might have both buyer_id and seller_id on a single order.
> The name of the class you’ll store the data into is singular (User). You therefore have a mismatch, and in ORMs (e.g., Rails) they often automatically pluralize, with the predictable result of seeing tables with names like addresss.
Laravel gets this right. Singular models, plural table names. Built in rules to pluralize, or override the defaults and add your own table name. (EDIT: Rails gets this right too)
> Strictly speaking, we’re not naming a table, we’re naming a relation.
I don't think so. We're naming the collection. The relation is the foreign key. that's why you would see `user_id`, not `users_id`. EDIT: I now understand that the author is referring to the relational algebra side of things here. I don't think that changes my skepticism of the argument though, because very rarely is it useful to discuss these things in terms of the mathemetics with your coworkers. Ambiguity around whether you're talking about a User (relation) and User (tuple representing a user) and a User relation(ship to other data) makes this go sideways in my opinion.
Rails seemingly handles it the same way as Laravel and it all makes sense very quickly.
The argument here that you would end up with 'addresss' is silly as you can also quickly handle that with a new inflection and it rarely even happens in practice.
Also, to be pedantic, rails handles this just fine by default:
irb(main):001:0> "address".pluralize
=> "addresses"
> Singular models, plural table names. Built in rules to pluralize, or override the defaults and add your own table name.
Why spend a single cycle of computing time on this? There's nothing objectively necessary about it, singular-only is entirely adequate to convey the important semantics. Every single symbol in any code devoted to this conversion unnecessary surface area / computation time that contributes nothing to the problem domain.
> I now understand that the author is referring to the relational algebra side of things here. I don't think that changes my skepticism of the argument though, because very rarely is it useful to discuss these things in terms of the mathemetics with your coworkers.
You don't even have to go that far. It's the user table. The order table. People know what an order is, and what a table in a database is; the semantics of db-table-ness conveys that it's a collection and the nature of the collection with so much more precision than english plurals that it's more likely obscuring to use those than contributing to understanding... much like a lot of natural language conventions, the emulation of which is really the only reason anybody does this.
But also, when we're talking about databases, relational algebra should no more be weird or inadmissible than boolean logic concepts are to coding.
Ecto gets this righter in my opinion. You always manually specify in your schema module. It's a trivial amount of work and in return you never have to worry about incorrect magic.
defmodule User do
use Ecto.Schema
schema "users" do # table name
field :name, :string
field :age, :integer, default: 0
field :password, :string, redact: true
has_many :posts, Post
end
end
You'd usually generate a schema and matching migration with `mix phx.gen.schema Accounts.User users name:string ...`
Forcing users to specify manually means that you'll often end up with a mix of plural and singular names based on the whim of the developer who happens to implement that feature at that given time. I think that's strictly worse than Ruby or Laravel's automatic and predictable approach.
Because in the long run its tiring to deal with inconsistencies, and when a framework can do it equally good or better than most devs and make it trivially simple to override it if you ever happen to need it, that must be better?
Aligning database theory and a modern RDBMS, you get:
1. Attribute => Column
2. Tuple => Row
3. Relation => Table
That's what he's talking about. I'm sure a debate about nomenclature will ensue.
For my part, this is not of any significance. I will store a tuple of data representing a user in a `users` table. If I'd stored it in a `user` table, then that wouldn't make my life hard either, but I'd prefer the plural to represent the fact that it's the name of a relation which has multiple tuples (each of which represents data for a singular user).
> I don't think so. We're naming the collection. The relation is the foreign key. that's why you would see `user_id`, not `users_id`.
that depends on your interpretation of "strictly speaking". As a (loose) implementation of the Relational Model[1], a table is indeed a Relation. In practical terms however - particularly with an ORM in play - a table stores a collection of entities, with relationships (not relations) implemented as foreign keys as you say.
> that's why you would see `user_id`, not `users_id`
Hmm, not really. Even in the relational interpretation, each row represents a single tuple in the relation. So singular phrasing for the attribute names is appropriate.
To go a bit further that equivalence is what put the relational in relational databases. Though really a set of tuples is just a model, you can reason about relations without needing to construct an explicit model for them (which is convenient when dealing with a relation on a proper class).
I don't disagree with any of your points. Personally, I even take it a step further by pluralizing both nouns in a "join table" name. For example, a model which joins a Group and a User will be named "GroupUser" and the table will be named "groups_users".
In my mind, the ORM object is a representation of a single group and a single user, so GroupUser. The table is a representation of all member users for all groups: groups_users. That goes against Rails conventions so I have to manually define the table name on such models but I also understand it's not exactly easy to detect these irregular inflections. I'll take the trade-off.
As this is clearly something people have preferences for, the only clear solution I can see is to stick to mass nouns for table names. Refactoring everything into words like "sand", "air", "rice", "bread", and similar words may be a challenge, but you'll never have to decide on singular or plural nouns!
99% of PostgreSQL reserved keywords are singular (except `references`), so if you use plural table names, you're more likely to avoid using a reserved keyword (such as `user`)
97% of SQL-92 reserved keywords are also singular (exceptions: constraints, diagnostics, names, references, rows, values)
This. While naming tables with singular names is the formally correct choices, since the name must be seen as the name of the relation and not of a collection (that is the same name that in a programming language you would choose for a class), in practice there are too many common keywords that are singular, thus you have the choice of choosing plural and still have meaningful names, add a prefix/suffix to all tables, or remembering to quote every time you use a name. I prefer the first, since the second one adds inconvenience and the third one adds possibility of mistakes when you forget to quote it.
Some of these arguments are a little questionable imo.
"It reads well everywhere else in the SQL query:"
You can alias the table names and have it read well in all places, and this mostly only matters if you are actually writing SQL and not using an ORM directly, which the next point seems to imply you would be using.
"The name of the class you’ll store the data into is singular (User). You therefore have a mismatch, and in ORMs (e.g., Rails) they often automatically pluralize, with the predictable result of seeing tables with names like addresss."
Almost any modern "pluralize" implementation would handle this correctly. Rails would call the table `addresses`.
"Some relations are already plural. Say you have a class called UserFacts that store miscellaneous information about a user, like age and favorite color. What will you call the database table?"
You would call the table user_facts... Am I missing something?
I think the point the author is making there is that normally, each row represents a single item, but in this example, a row represents multiple facts. So normally you might have User(name=Bob, pw=**), but here you've got UserFacts(age=93, hair_color=gray). That is, UserFacts is a single instance of an object that happens to be a bundle of facts. So when you come to create the table names and use plurals, you can convert User to users, but UserFacts can't be pluralised again. (user_factses?)
Fwiw, this seems like a pretty contrived example, and I'm struggling to think of a better one. Maybe if you recorded user achievements as a single row for each user, with each achievement being its own column? But in most situations like that it would probably be best to take the extra normalisation step and just have a separate table for all the achievements in the game, that way it's a lot easier to add new ones.
But if we’re already in SQL pedantry mode, we wouldn’t be storing multiple facts in one row to begin with. “Use singular SQL table names so that using a relational db like a document store is less awkward” is a weird argument.
(The right answer, as others have already pointed out, is “whatever is consistent”)
I'd call it user_facts too, it's an 1:n relationship from users to facts. Maybe `users_facts` if it's an n:m relationship, but frankly I can never come up with good names for many-to-many tables.
There's definitely some ambiguity there, that's a good point.
I'd probably say that users_facts would be a to-many join table between users and facts, like if you had one row per fact and a multiple facts per user though that example doesn't really make sense here (could just have the FK exist in Fact and not need a join table). If UserFacts were stored in a table with multiple facts in one row about a single user, I would probably call that table user_facts.
Would probably also be fine with running across either in any codebase (or even singular table names, for that matter! as long as it's consistent :D )
I don't find the arguments either way particularly convincing. IMO just be consistent with the existing schema (if there is one), or use whatever you prefer.
Yes, this is bikeshedding at its worst. Just pick one. Preferably one that doesn't involve "clever" code that changes the names of things behind your back based on the English language's weird pluralization edge cases.
While there's a variation in how clever/complex pluralization code can be, there's only one class-table correspondence convention that doesn't end with codepaths trying some level of automatically changing the names of things behind your back... and it's singular table names.
> Some relations are already plural. Say you have a class called UserFacts that store miscellaneous information about a user like age and favorite color. What will you call the database table?
You change the class to singular and the database table is plural.
The class is used to instantiate an object that is one instance of the model. So it's singular.
The corresponding table is a collection of records (corresponding to instances) of the model. So it's plural.
It seems to me more natural to use a plural name for a collection, and a singular name for an instance. There isn't a fundamental difference between a sql/mongo table and a list/array in this sense.
Occasionally you'll have a collection of collections so you have to give it a double-plural name like "hobitses", but I have it on good authority that that is totally a legit word.
I agree, and the example could make the problematic aesthetics of plural names even more obvious:
SELECT id, name
FROM users
JOIN countries ON users.countries_id = countries.id
WHERE countries.name = 'Canada';
But the real problem is that in the ORMs and frameworks, entities are related through english-language pluralization rules, magically applied. That can be confusing to newcomers (and non-native speakers), and even experienced users, and sometimes gets badly in the way when the pluralization logic fails.
Just stick to singular, and a whole class of issues disappears.
In the example, the code for user is broken, because user is a reserved keyword. This is the whole reason we pluralize. That, and we are naming a collection of relations and not naming a relation (that’s FK). The schema of the row is the object. The table is just a collection of rows with the same schema.
The fact that the article recommends user over users shows they have no authority naming anything at the database level.
Yes but it’s non-standard. Some engines use quotes, some back ticks. If someone uses an orm then it’s gets tricky if the orm doesn’t automatically escape and you can manually specify table name.
If you are hand writing a DAL then by all means name it whatever you want. To me having a table as singular goes against what a table is.
Of those, user is the only one that must be quoted in postgresql. It doesn't show up that way in psql through, so it's a bit inconsistent.
test=# create table test (
id serial,
"user" text,
date text,
timestamp text,
name text,
version text);
CREATE TABLE
Then again, lots of things are actually valid column identifiers, if they're quoted.
test=# alter table test add column "select" text;
ALTER TABLE
test=# alter table test add column "*" text;
ALTER TABLE
test=# alter table test add column "from" text;
ALTER TABLE
test=# select "select", "*", "from" from test;
select | * | from
--------+---+------
(0 rows)
test=# \d test
Table "public.test"
Column | Type | Collation | Nullable | Default
-----------+---------+-----------+----------+----------------------------------
id | integer | | not null | nextval('test_id_seq'::regclass)
user | text | | |
date | text | | |
timestamp | text | | |
name | text | | |
version | text | | |
select | text | | |
* | text | | |
from | text | | |
This is why one should go with `$APP_User` or append `$APP` to generic sounding classes/tables. It's a slight inconvenience.
Or maybe get into the habit of wrapping your table/column names with backticks to tell the query language of choice that you're not referring about the keyword.
The conflict arises because the table name serves both to name a collection and to name a type. For types, the general convention is for the name to correspond to what a single instance of the type represents, so usually singular.
In an alternative universe, databases would maybe distinguish between the two like programming languages do. But you rarely have two tables with the same “type”, so there’s a reason why they don’t.
Arguably, table names occur more often in their “type” capacity than in their “collection” capacity. The most convincing argument for me, besides ORM name mappings becoming trivial, is the “<table>.<column>” syntax, where you want to have “employee.salary” rather than “employees.salary”.
Not that long ago, length limits on table names were an additional argument for sticking with the shorter convention.
This topic also reminds me of the convention for Git commit messages to use the imperative mood instead of past tense, which at first feels odd, but then you get used to it rather quickly.
> Say you have a class called UserFacts that store miscellaneous information about a user, like age and favorite color. What will you call the database table?
Our MIS does this in Postgres, and it's very frustrating at times. Especially when you have to remember to quote all your table names in generated queries, so as not to run up against reserved words.
I've started to do the same for C arrays, leaving the plural as the name for its paired count:
User *user = ...;
int users = ...;
for (int i = 0; i < users; i++) {
user[i] = ...;
}
This also used to be the Google convention for naming repeated fields in Protocol Buffers, so that the generated code would read like `add_user()` and not `add_users()`, though it seems like that has changed?
That doesn't make sense to me. The array is a collection of users, so `users` makes more sense to me. And what you are calling users is `totalUsers` or a variant of that.
The argument about the name of the class representing a single entity not matching doesn't hold for me - why would I want that to match anyway? What I want to match is the name of any variable/property that represents the table as a collection of entities, and I nearly always want that to be plural (though I'd be fine with something called "UserTable" - just haven't seen that convention used).
Honestly the only system I don't want to work with is one that uses a mixture of plural and singular names, or that uses incorrect pluralisations. But in 25+ years of working with various database schemas I've yet to come across a table named "addresss" or even "childs".
> Strictly speaking, we’re not naming a table, we’re naming a relation.
If that's the case, the SQL command would be "CREATE RELATION", not "CREATE TABLE". This seems like an attempt to torture the truth to get the argument the author wanted.
> Some relations are already plural. Say you have a class called UserFacts that store miscellaneous information about a user, like age and favorite color. What will you call the database table?
> The last argument above is the strongest, because it only takes one such exception to wreck an entire schema’s consistency. You won’t run into problems with singular, now or later.
If it's valid to have a named User that stores User for the sake of consistency, then it should also be valid to have a table named UsersFact that stores user's facts for the sake of consistency? Doesn't it work both ways?
1 seems questionable, since in practice we both refer to the definition of the relation as well as the collection of elements fulfilling the relation as the table.
2 is a fair point, but I'm not sure when was the last time I didn't use table aliases for anything more than a 2 liner.
3 I think that is due to a conceptual mismatch between SQL, which is basically just an implementation of relational algebra, and OOP. I'm not to well versed in this, but an ActiveRecord table class is actually more similar to a messenger or factory producing row objects.
Reading between the arguments it feels like the most meaty point is how plurals in table names play poorly with consistency in the context of auto-generated code and UIs.
One would wonder how AI guided code / ui generation could cope with naming inconsistencies on the fly and use better reading aliases in the generated output. (much like a human does). Frankly I would be shocked if we don't see systems that behave like this in the future. The age of dumb tooling seems to be close to an end.
Real talk: y'all build databases like shit. The database is your persistence layer, not your application layer. You give lip service to software engineering concepts like abstraction, DRY, SOLID, or whatever was popular right as you got out of college, but then you throw it all in the trash when you design databases. Your databases are awful. Don't feel bad: literally everyone is doing it. But also do feel bad: you could be doing so much better. It starts by realizing that software engineering principles apply to everything, including (perhaps especially) the database layer. All other database-design advice is secondary to that.
Stop building shit-ass awful databases, please. You don't need to be reading about whether to pluralize your table names or not. You need to be reading about how to even begin to do even a moderately decent job. Go back to your favorite software engineering book and replace all mentions of "class" with "database table" and see if it works. It won't, but why it doesn't will be enormously instructive.
Just to take a different perspective for anyone reading this, your database is more than _just_ your persistence layer. It can be where rules about the business live. Rather than rely on 12 different applications or code in 27 different files to maintain the proper rules, you can enforce those rules right where the data lives, guaranteeing that no matter what moron is writing code, the data will be pristine because the database has checked it before putting it in.
Edit: Just to be clear, I'm not saying it's the right thing to do, at least not in every case. But it's an option; one that's different from the parent's perspective anyhow.
Interesting. Mutation testing on database tables. Instead of testing application code, business rules--and constraints--are mutated against a test suite--
Most databases probably don't have test suites. But, it would be interesting to see if any work has been done on mutating constraints or "dirty data" injection.
This would ideally take the form of some annotated algebra, so that it is db driver agnostic, if possible.
“Don’t feel bad […] but also do feel bad” sums up… a lot of engineering growth. Everyone makes mistakes, and we don’t want junior engineers beating themselves up about it. But if you don’t honor the reasons why they’re beating themselves up—eg pride in their work, caring about any time they cost others with an outage, etc—telling them not to worry looks a lot like an insult. The trick is to help them identify where that sense of responsibility stops being productive (e.g. by driving them to go read docs, or test that edge case in the future) and is just stressing them out (e.g. blaming themselves for a rocky release when the real problem is that deployments at your company are a 20-step process with poor visibility and no undo).
Yeah. We figured this relational stuff out 40+ years ago, but the 23 year old devs are acting like it was born yesterday and they're inventing it from first principles. As the saying goes: read a book.
> The last argument above is the strongest, because it only takes one such exception to wreck an entire schema’s consistency.
The last argument:
> Some relations are already plural. Say you have a class called UserFacts that store miscellaneous information about a user, like age and favorite color. What will you call the database table?
I lean towards thinking this is a non-issue. In this specific case I'd actually call the model UserFact and the table users_facts. I'd assume it maps many users to many facts. It otherwise sounds like there's some json or "text" (as in "ADD COLUMN my_column TEXT") data that is storing words as "facts" and the table name is just kinda weird. I don't see the problem and if someone were to insist there's a problem, I'd probably suggest changing the model name.
This feels like some Sapir–Whorf type thing. The software can do anything, yet you can’t help but act as if there is some constraint because it doesn’t cleanly map to language.
A table naming thread reminds me that it has been a while since I’ve seen a schema with tables all named “tbl_whatever”. The last one I saw even had a view called tblUserGrants. I LOLed.
One of the most frustrating things in software engineering is eyeballing a decision (i.e. naming a table with a plural) and just gut-level knowing it's the wrong decision, but not quite being able to remember specific pitfalls that instilled that feeling earlier in your career.
That fourth point is the kind of thing that just cuts right through the conversation - "doing it this way does not scale and will fail, here's an example of how". Such frission in finding those and getting consensus around them, because you can feel the future pain you've avoided evaporating.
I wrote a top level comment but I really don't understand this argument. Just because the model is "multiple facts about a user" does not mean you have to jump through hoops to name the database table. It would just be user_facts, wouldn't it? That seems intuitive and straightforward to me, even if the class name is also plural.
If you have a "users" table, you either have to name the column "users_id", which is incorrect (because per row it's the id of a single user), or you have to use mental cycles to de-pluralize the name (or even worse, try to remember whether you called it "users_id" or "user_id" on a case by case basis).