If you have an index on the uuid anyways having a separate big serial field for ...

rezonant · on July 6, 2024

As mentioned elsewhere, it ensures the ability to perform resumable and consistent batching queries across the data set without missing records.

Ordering by an insertion timestamp is not enough if two records can share the same timestamp: you may miss a record (or visit a record twice) across multiple queries.
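A minimal sketch of that failure mode, assuming SQLite and a hypothetical `events` table with an auto-incrementing `id` and an integer `created_at` column (none of these names come from the thread). It compares a cursor on the timestamp alone with a cursor on the serial ID.

```python
import sqlite3


def make_db(timestamps):
    """Build an in-memory table; `events`, `id`, `created_at` are illustrative names."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "created_at INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO events (created_at) VALUES (?)",
        [(t,) for t in timestamps],
    )
    return conn


def batch_by_timestamp(conn, batch_size):
    """Resume each batch from the last timestamp seen (the lossy approach)."""
    seen, last_ts = [], -1
    while True:
        rows = conn.execute(
            "SELECT id, created_at FROM events "
            "WHERE created_at > ? ORDER BY created_at LIMIT ?",
            (last_ts, batch_size),
        ).fetchall()
        if not rows:
            return seen
        seen.extend(row[0] for row in rows)
        last_ts = rows[-1][1]  # any unread row sharing this timestamp is skipped


def batch_by_serial_id(conn, batch_size):
    """Resume each batch from the last serial ID seen."""
    seen, last_id = [], 0
    while True:
        rows = conn.execute(
            "SELECT id FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return seen
        seen.extend(row[0] for row in rows)
        last_id = rows[-1][0]
```

With timestamps `[1, 1, 2, 2, 2, 3]` and a batch size of 2, the timestamp cursor visits only 5 of the 6 rows, because the third row with timestamp 2 falls on a batch boundary. The serial-ID cursor visits all 6.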

disneycember · on July 6, 2024

This is solved by sorting by timestamp first, then by the random PK UUID. I don't think slightly simpler batch queries justify leaking time and quantity information, or the complexity of handling two types of IDs.
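One way this could look, as a sketch only: keyset pagination over the pair (timestamp, UUID), again assuming SQLite and a hypothetical `events` table with a text `uuid` primary key and an integer `created_at`. The tie-break is written out as an explicit `OR` so it does not depend on row-value comparison support.

```python
import sqlite3
import uuid


def make_uuid_db(timestamps):
    """Build an in-memory table keyed by a random UUID (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "uuid TEXT PRIMARY KEY, "
        "created_at INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO events (uuid, created_at) VALUES (?, ?)",
        [(str(uuid.uuid4()), t) for t in timestamps],
    )
    return conn


def batch_by_timestamp_then_uuid(conn, batch_size):
    """Page through rows ordered by (created_at, uuid), resuming after the last pair."""
    seen = []
    last_ts, last_uuid = -1, ""
    while True:
        rows = conn.execute(
            "SELECT uuid, created_at FROM events "
            "WHERE created_at > ? OR (created_at = ? AND uuid > ?) "
            "ORDER BY created_at, uuid LIMIT ?",
            (last_ts, last_ts, last_uuid, batch_size),
        ).fetchall()
        if not rows:
            return seen
        seen.extend(row[0] for row in rows)
        last_uuid, last_ts = rows[-1]  # cursor is the full (timestamp, uuid) pair
```

Because the UUID breaks ties between equal timestamps, a static data set is covered exactly once regardless of where the batch boundaries fall.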

rezonant · on July 6, 2024

You wouldn't expose the numeric IDs publicly, and ideally you'd use your database's automatic ID selection to avoid any complexity.

The UUID sorting works in the common case, but if you happen to end your batch near the current time, you still run the risk of losing a few records if the insert frequency is sufficiently high. Admittedly this is only a problem when you are batching all the way through to current insertions.
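A small sketch of that edge case, under the same assumptions as before (SQLite, a hypothetical `events` table). The UUID-like strings are made-up fixed values chosen so the late row sorts below the cursor; real random UUIDs would only do so some of the time.

```python
import sqlite3


def fetch_after(conn, cursor, limit):
    """Return rows strictly after the (created_at, uuid) cursor."""
    last_ts, last_uuid = cursor
    return conn.execute(
        "SELECT uuid, created_at FROM events "
        "WHERE created_at > ? OR (created_at = ? AND uuid > ?) "
        "ORDER BY created_at, uuid LIMIT ?",
        (last_ts, last_ts, last_uuid, limit),
    ).fetchall()


def resume_after_late_insert():
    """Read up to 'now', then insert a row sharing the cursor's timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (uuid TEXT PRIMARY KEY, created_at INTEGER NOT NULL)"
    )
    # Hypothetical row written at timestamp 100.
    conn.execute(
        "INSERT INTO events VALUES ('8f000000-0000-4000-8000-000000000000', 100)"
    )
    first_batch = fetch_after(conn, (-1, ""), 10)
    cursor = (first_batch[-1][1], first_batch[-1][0])

    # A second row lands in the same timestamp, but its UUID sorts
    # before the one the cursor already passed.
    conn.execute(
        "INSERT INTO events VALUES ('1a000000-0000-4000-8000-000000000000', 100)"
    )
    return fetch_after(conn, cursor, 10)
```

Here the resumed query returns nothing, so the second row is never visited even though it was inserted after the first batch was read.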

j45 · on July 6, 2024

I agree with not baking more intelligence into a piece of data than needed, especially an index.