I was in the never-UUID camp, but have been converted. Of course depends on how ...

sgarland · 2025-04-29T11:22:06 1745925726

I’ve never understood this argument. In every RDBMS I’m aware of, you can get the full row you just inserted sent back (RETURNING clause in Postgres, MariaDB, and new-ish versions of SQLite), and even in MySQL you can read the last auto-increment id generated from the cursor used to run the query.
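For illustration, a minimal sketch of both approaches using Python's built-in `sqlite3` module. The `orders` table and its columns are made up for the example, and the `RETURNING` branch is guarded because it needs SQLite 3.35 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, only here to have something to insert into.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT)")

# The cursor exposes the id generated by the INSERT it just ran.
cur = conn.execute("INSERT INTO orders (note) VALUES (?)", ("first",))
first_id = cur.lastrowid

# RETURNING sends the inserted row back in the same statement.
# It needs SQLite 3.35+, so check the library version first.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    row = conn.execute(
        "INSERT INTO orders (note) VALUES (?) RETURNING id, note", ("second",)
    ).fetchone()
else:
    row = None
conn.commit()
```

Either way the id only exists after the INSERT has run, which is the point the reply below picks up on.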

tpm · 2025-04-29T12:35:05 1745930105

Now imagine that storing the complete model is the last thing you do in a business transaction. So the workflow is something like 'user enters some data, then over the course of the next few minutes adds more data, the system contacts various remote services that can also take a long time to respond, the user can even park the whole transaction for the day and restore it later', but you still want to have a unique ID identifying this dataset for logging etc. There is nothing you can insert at the start (it won't satisfy the constraints and is also completely useless). So you can create a synthetic ID at the start, but it won't be the real ID when you finally store the dataset. Or you can just generate a UUID anywhere, anytime, and it will be the real ID of the dataset forever.
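A rough sketch of that idea in Python, assuming a hypothetical `dataset` table and a plain list standing in for the log. The id is generated before anything is stored and is reused unchanged when the row is finally inserted.

```python
import sqlite3
import uuid

# Generate the identifier up front; no database round trip is needed.
dataset_id = str(uuid.uuid4())

# Stand-in for real logging: every step can already refer to the final id.
log_lines = []
log_lines.append(f"dataset={dataset_id} step=started")
log_lines.append(f"dataset={dataset_id} step=remote-call-finished")

# Much later, once the model satisfies its constraints, store it under the same id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")
conn.execute(
    "INSERT INTO dataset (id, payload) VALUES (?, ?)",
    (dataset_id, "complete model"),
)
conn.commit()

stored_id = conn.execute("SELECT id FROM dataset").fetchone()[0]
```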

sgarland · 2025-04-29T13:07:11 1745932031

So have a pending table with id, user_id, created_at, and index the latter two as a composite key. SELECT id FROM pending WHERE user_id = ? ORDER BY created_at DESC LIMIT 1.

Preferably delete the row once it's been permanently stored.
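A minimal sketch of that pending table with Python's `sqlite3`. The index name and the helper functions (`start_pending`, `latest_pending`, `finish_pending`) are assumptions made for the example, and timestamps are passed in as ISO strings to keep it deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pending (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
-- Composite index on the latter two columns, as described above.
CREATE INDEX pending_user_created ON pending (user_id, created_at);
""")

def start_pending(user_id, created_at):
    """Reserve an id at the start of the workflow."""
    cur = conn.execute(
        "INSERT INTO pending (user_id, created_at) VALUES (?, ?)",
        (user_id, created_at),
    )
    return cur.lastrowid

def latest_pending(user_id):
    """Return the most recent pending id for a user, or None."""
    row = conn.execute(
        "SELECT id FROM pending WHERE user_id = ? ORDER BY created_at DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    return row[0] if row else None

def finish_pending(pending_id):
    """Delete the row once the data has been permanently stored."""
    conn.execute("DELETE FROM pending WHERE id = ?", (pending_id,))

a = start_pending(7, "2025-04-29T13:00:00")
b = start_pending(7, "2025-04-29T13:05:00")
```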

Keeping an actual transaction open for that long is asking for contention, and the idea of having data hanging around ephemerally in memory also seems like a terrible idea – what happens if the server fails, the pod dies, etc.?

tpm · 2025-04-29T13:18:44 1745932724

> So have a pending table

With UUIDs there is absolutely no need for that.

> Keeping an actual transaction open for that long is asking for contention

Yes, which is why this is not an actual db transaction, it's a business transaction as mentioned.

> and the idea of having data hanging around ephemerally

The data is not ephemeral of course. But also the mapping between business transaction and model is not 1:1, so while there is some use for the transaction ID, it can't be used to identify a particular model.

sgarland · 2025-04-29T17:30:28 1745947828

> With UUIDs there is absolutely no need for that.

Except now (assuming it’s the PK) you’ve added the overhead of a UUID PK, which, depending on the version and your RDBMS vendor, can be massive. If it’s a non-prime column, I have far fewer issues with them.
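To put rough numbers on the key width alone, here is a small Python sketch comparing a UUID with an 8-byte integer key. It only measures the raw values. Actual on-disk overhead depends on the engine and column type, and index locality with random (v4) values is a separate cost this does not capture.

```python
import struct
import uuid

u = uuid.uuid4()

# A UUID is 128 bits: 16 bytes in binary form, 36 characters as text.
uuid_binary_bytes = len(u.bytes)
uuid_text_chars = len(str(u))

# A 64-bit integer key, for comparison.
bigint_bytes = struct.calcsize("<q")
```

In engines where secondary indexes carry a copy of the primary key (InnoDB, for example), the wider key is paid for again in every secondary index.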

> The data is not ephemeral of course.

I may be misunderstanding, but if it isn’t persisted to disk (which generally means a DB), then it should not be seen as durable.

tpm · 2025-04-29T18:08:37 1745950117

The whole content of the business transaction is persisted as a blob until it's "done" from the business perspective. After that, entities are saved (or updated, or deleted) in their respective tables, and the blob is deleted. This gives the users great flexibility during their workflows.
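One way this could look, sketched in Python with `sqlite3` and a JSON blob. The table names (`business_tx`, `customer`), the blob layout, and the `park`/`complete` helpers are all assumptions for illustration, not the actual system described here.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business_tx (id TEXT PRIMARY KEY, state TEXT NOT NULL);
CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT NOT NULL);
""")

def park(tx_id, state):
    """Persist the whole in-progress business transaction as one blob."""
    conn.execute(
        "INSERT OR REPLACE INTO business_tx (id, state) VALUES (?, ?)",
        (tx_id, json.dumps(state)),
    )
    conn.commit()

def complete(tx_id):
    """Write entities to their own tables and drop the blob, in one short DB transaction."""
    row = conn.execute(
        "SELECT state FROM business_tx WHERE id = ?", (tx_id,)
    ).fetchone()
    state = json.loads(row[0])
    with conn:  # commits on success, rolls back on error
        for c in state["customers"]:
            conn.execute(
                "INSERT INTO customer (id, name) VALUES (?, ?)",
                (c["id"], c["name"]),
            )
        conn.execute("DELETE FROM business_tx WHERE id = ?", (tx_id,))

tx_id = str(uuid.uuid4())
customer_id = str(uuid.uuid4())  # the entity's final id, known before it is stored
park(tx_id, {"customers": [{"id": customer_id, "name": "Ada"}]})
complete(tx_id)
```

The entity id lives inside the blob from the start, so it does not change when the row finally lands in its own table.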

Yes, the overhead of UUIDs was something I mentioned already. For us it absolutely makes sense to use them; we don't anticipate having hundreds of millions of records in our tables.

sgt · 2025-04-29T09:04:01 1745917441

I also do that for convenience. It helps a lot in many cases. In other cases I might have tables that may grow into the millions of rows (or hundreds of millions), and then I'd absolutely not use UUID PKs for those particular tables. I'd also shard them across schemas or multiple DBs.