This article is about random IDs causing page thrashing because MySQL's B-tree indexes don't handle them well. They are bad for _MySQL performance_.
It doesn't talk about NoSQL or sharding, where random IDs usually perform much better than sequential due to a lack of hot shards.
If you distribute your reads and writes at random across N machines you can get ~Nx performance vs one machine. If you make your writes sequential, you'll usually get ~1x performance because every insert goes to the same machine for this second/minute/day, then rolls to a new one for the next period. There are sharding schemes that can counter this, but they require insight into your data design before implementation.
Sure, but each individual machine still has to do the same slow random lookups, right? Generally you want some deterministic component (for caching) and some random component (for sharding). Snowflakes work well for this, since you can use the upper bits for predictable caching and the lower bits for random entropy.
I don't know if you've actually been to their blog recently, but Percona's community includes NoSQL as they maintain their own distro of MongoDB--similar in spirit to their MySQL and PostgreSQL offerings.
I never understood why people would use sequential UUIDs. That rather defeats the purpose of a UUID. If you need something sequential then just use a much simpler number.
It depends on how the non-sequential part is derived.
IIRC this is often a mix of hardware network address and a portion that is either random or based on a high-precision time, so it is still unlikely that you'll see collisions between machines.
MS SQL Server's NEWSEQUENTIALID function returns something akin to a v4 UUID (fully random, aside from variant indicator bits; I'm not sure whether the variant bits are respected in NEWSEQUENTIALID's output or if it just returns a 128-bit number), but after the first is generated the rest follow in sequence until the sequence is reset (by a reboot). Assuming the variant indicators are present, there are 122 random bits. Even if your system is up long enough to use 2^58 (2.88×10^17 if you want that in decimal) IDs generated this way, you still effectively have 64 bits of randomness. For most systems the chance of collision with sequential UUIDs is, while larger than with other types, still so small as to be inconsequential. These are "atoms in several galaxies" level numbers.
You don't want to use sequential UUIDs in place of v4 UUIDs where security matters, of course, as it is easy to work out the next value in the sequence.
> If you need something sequential then just use a much simpler number
Sequential isn't their only property. UUIDs, including those that increment from the start point, are intended not to collide with those generated elsewhere.
Sequential UUIDs are a compromise - giving away a small amount of collision protection in order to gain what could be a significant efficiency boost in some circumstances (DB indexes being the main one).
Each GUID generated by using NEWSEQUENTIALID is unique on that computer. GUIDs generated by using NEWSEQUENTIALID are unique across multiple computers only if the source computer has a network card.
UUIDs are integers, they just happen to be 128 bits long instead of the more common 32 or 64, and they are usually printed in a specific way that differs from how integers are usually printed. But that's just smoke and mirrors.
UUIDs are generally not guaranteed to be universally unique. The name is misleading marketing.
If you generate them randomly, then the probability of collision is small enough not to matter, but the more you introduce deterministic elements, like generating them sequentially, the more your probability of collision increases.
Right that’s why nobody starts at 0. And you don’t have to only add 1 to be ordered, no DB I’m aware of just adds 1 to sequential UUIDs. There’s always randomness/entropy involved.
Sequential UUIDs don’t start at 0. They are a 128-bit composite of two integers; a temporal component in the high order bits and a random component in the low order bits.
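For illustration, here's a minimal sketch of that composite layout in Python. The field widths (48-bit millisecond timestamp, 80 random bits) are illustrative assumptions, not any particular RFC version or database's implementation:

```python
import os, time, uuid

def time_ordered_uuid() -> uuid.UUID:
    """Sketch of a 'sequential' UUID: a 48-bit millisecond timestamp in the
    high-order bits, 80 random bits in the low-order bits. The field widths
    here are illustrative, not any particular RFC version."""
    ms = int(time.time() * 1000) & ((1 << 48) - 1)  # temporal component
    rand = int.from_bytes(os.urandom(10), "big")    # 80 bits of entropy
    return uuid.UUID(int=(ms << 80) | rand)

a, b = time_ordered_uuid(), time_ordered_uuid()
# Later IDs compare higher, unless both were generated in the same millisecond.
print(a < b or a.bytes[:6] == b.bytes[:6])
```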
Interestingly, for other systems you sometimes want the exact opposite: for your key space to be distributed across indexes to balance the load (vs wanting them all to hit the same “hot” index for MySQL).
For example, Google’s Cloud Firestore is bottlenecked to 500 writes/second if your key is monotonically increasing (like a timestamp or these timestamp-based UUIDs) causing you to “hotspot” the index: https://cloud.google.com/datastore/docs/best-practices#high_...
Yeah S3 has similar performance issues where accessing objects with the same prefixes has lower throughput because they get sharded onto the same server. It's very counterintuitive when you're used to how performance works on single computers where you want to optimize for cache-locality.
This used to be the case, but it's not true any more.
For example, previously Amazon S3 performance guidelines recommended randomizing prefix naming with hashed characters to optimize performance for frequent data retrievals. You no longer have to randomize prefix naming for performance, and can use sequential date-based naming for your prefixes.
I re-read your comment and you do say "same" prefixes (I think I read it as "shared"), which AWS hasn't changed the behavior of (AFAIK). You're right that objects with the exact same prefix are still routed to the same shard (set?) and have that throughput limit.
P.S.: In a few projects I've built prefixes using timestamps, but not at the very beginning, and worried that they weren't getting sharded out. The change I linked to fixes that problem.
Is it that they truly want it distributed across the key space or distributed across the shards?
With MySQL, you're writing to a single box. Because of that, you'll want to be on hot pages. With Firestore, you're writing to many boxes. If you hotspot with Firestore, you're only writing to one box. If you write completely randomly to Firestore, maybe each box is bottlenecked to 400 writes/second (rather than 500) because there's no locality on each box. But if Google isn't charging you based on locality, there's no penalty for you since it'll scale up to more boxes (invisibly behind the scenes).
If you're only writing to one box, might as well take advantage of locality. If you have the opportunity to write to many boxes, you don't want all of the load going to one box.
However, one could imagine a distributed system where you could try and get both. Create an ID that has the ShardId to route it to the correct box and then something sequential after the ShardId. If you had 1,000 shards and 25 boxes, each box would be responsible for 40 shards so you wouldn't be perfectly sequential, but you'd kinda end up with 40 hot pages rather than randomly writing all over the place. It would also give you ample room to scale up your cluster.
So there is room to apply this technique to a distributed system as well.
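A rough sketch of that composition in Python, with a hypothetical shard count, illustrative bit widths, and in-memory counters standing in for real persisted per-shard sequences:

```python
import itertools

NUM_SHARDS = 1000  # hypothetical logical shard count

# One monotonically increasing counter per logical shard (illustrative only;
# a real system would persist these, e.g. one sequence per shard).
_counters = {s: itertools.count(1) for s in range(NUM_SHARDS)}

def make_id(user_id: int) -> int:
    """Compose shard id (high bits) + per-shard sequence (low bits), so routing
    stays deterministic while writes inside each shard remain mostly sequential."""
    shard = user_id % NUM_SHARDS   # pick the logical shard
    seq = next(_counters[shard])   # sequential within that shard
    return (shard << 48) | seq     # 16 bits of shard, 48 bits of sequence

print(hex(make_id(42)), hex(make_id(42)), hex(make_id(7)))
```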
Oh, one thing I'd note is that MySQL's InnoDB uses index-oriented (clustered) tables as they note in the article (which is why this is important with MySQL). PostgreSQL doesn't use index-oriented tables so new rows are (I believe) just written sequentially by default regardless of ID. It will have to update the primary-key index, but because the PK will be a lot smaller than the whole rows, you probably don't have to worry as much about randomly writing there. It's easier to keep an index hot than to keep a whole table hot.
You know, I'm likely wrong here. I think murmur is good because it has a good distribution and is much faster than SHA? I did find someone that compared murmur to other non-crypto hashes here: https://softwareengineering.stackexchange.com/a/145633. I'd love to see this compared with sha1.
Can you elaborate on the claim about SHA? If I'm not mistaken, all these cryptographic hashes have a 50% chance of flipping any one output bit upon changing a single input bit -- is there some hidden gotcha I'm unaware of?
It’s about random distribution. If you output an array of the key space as a bitmap, it should look like white noise. SHA doesn’t look like white noise. There are certain parts of the space it rarely goes with certain classes of inputs (numeric, usernames, etc) while murmur and md5 give that white noise look.
Isn't this easily solved by supporting 128 bit keys and using UUIDs as intended, i.e. as integers and not in their string serialization? This is as nonsensical as storing IPv4 as strings instead of 32 bit integers.
UUIDs are generally generated randomly. This results in terrible insert performance into B+tree-based indexes, and terrible (= no) lookup locality with basically any database. In a large table, successive entries end up in separate disk pages.
Even with time-based UUIDs, the time fields are ordered backward, which produces the same issue.
One way to fix this (beside the methods outlined in the article) is to create time-based UUIDs with fields sorted the other way around. Or reverse the time fields during indexing (slower, but still not as slow as an unnecessary disk read!).
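A quick Python sketch of that reordering (the same idea as UUIDv6 and, I believe, MySQL 8's UUID_TO_BIN(uuid, 1) swap flag; treat the exact byte layout here as my reading of it, not a reference):

```python
import uuid

def ordered_key(u: uuid.UUID) -> bytes:
    """Rearrange a version-1 UUID so the high-order timestamp bits come first:
    time-hi+version, then time-mid, then time-low, then the rest unchanged."""
    b = u.bytes
    return b[6:8] + b[4:6] + b[0:4] + b[8:]

ids = [uuid.uuid1() for _ in range(5)]
for u in sorted(ids, key=ordered_key):
    print(u, u.time)  # timestamps come out in non-decreasing order
```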
Firestore has exactly the opposite problem - indexed sequential values put pressure on the "last tablet" and require the same tablet to repeatedly split. This becomes the limiting factor on insert volume. Random generated keys are better, because they spread inserts across multiple tablets (ie servers).
I don't know for certain, but I suspect DynamoDB and most other databases that can trace their origins to the bigtable whitepaper have similar behavior.
Indeed, traditional rdbms are generally optimized for patterns with high locality. High locality patterns with NoSQL and NewSQL style dbs often end up causing one node in the system to receive disproportionately high load that defeats attempts to spread operations across the cluster.
It can't. You can only perform range queries and the like on "sort keys." All sort keys are on the same partition, which has limits on storage size and throughput.
Partitioning is very explicit in DynamoDB, for better or for worse. Harder to shoot yourself in the foot, but also limits what you can do.
I can't say I've had much issue with this in postgres, but even still, this is easily worked around by having a pkey/cluster key of a monotonically increasing int--which is useful for other things such as data migrations, as you can pick defined start and end targets without worrying about new things being inserted in between.
Pg does not cluster, so the only impact of a uuid is that you're inserting in a random location of the index, which is not great but not the end of the world (probably).
InnoDB clusters on the PK by default, so when you're inserting a UUID you're not only inserting in the middle of the index (on average) you're also inserting in the middle of the table.
And I don't know how much the on-disk storage has been optimised for this sort of thing, but if the answer is "not much" and sparse pages are not really a thing, you might need to rewrite half the table in order to do so.
It's going to depend on the database. This article is primarily focused on InnoDB and goes over the read/write locality issues with a uuid.
That isn't always going to matter (or can be a very good thing). For a KV store like dynamodb a uuid is a great key because it'll distribute nicely across your shards.
It also depends a lot on your query patterns. If you have a uuid primary key but you're partitioning by some other value you may end up significantly reducing the number of pages you have to look through.
Depending on caching strategy, using random UUIDs also decreases the maximum working set size before you become I/O bound by a huge factor. Postgres e.g. relies heavily on the O/S block cache and is subject to this.
That helps with storage, but still is larger than a bigint, and doesn't help with the random distribution of data. I believe newer versions of MySQL have a data type for this.
MySQL doesn't have a UUID type, nor a 128-bit int type. But MySQL 8 does provide the functions UUID_TO_BIN and BIN_TO_UUID, which make it trivial to convert between the human-readable string representation of a UUID and the more efficient BINARY(16) representation.
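In application code the equivalent conversion is just going between the 36-char text form and the raw 16 bytes. A quick Python illustration (no byte swapping, i.e. the bytes you'd put in a BINARY(16) column):

```python
import uuid

u = uuid.uuid4()
text_form = str(u)      # 36 chars -- what a CHAR(36) column would hold
binary_form = u.bytes   # 16 bytes -- what a BINARY(16) column would hold

assert len(text_form) == 36 and len(binary_form) == 16
assert uuid.UUID(bytes=binary_form) == uuid.UUID(text_form)
```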
MariaDB now has a native UUID column type as of MariaDB 10.7. This is brand new -- 10.7 had its RC release in Nov, not GA yet but very soon I'd imagine.
If you use a type1 UUID, you shouldn't have quite so random distribution, particularly if there's only a single machine generating the UUIDs (which admittedly, kind of misses the point, but alas is done all the time).
Well, you probably shouldn't cluster on a UUID, but it's not really a great idea to blindly cluster on a sequential id, either. You should cluster based on how you query the data to minimize I/O.
That's only part of it, the major problem with classic UUIDs are that writes occur randomly across the entire keyspace. Whether it's represented as a string or in binary, ordering does not change, and so the underlying indexing method must cope with these random orderings at insertion time.
At least btrees perform much worse with random insertions. I don't know how much impact it has on LSM
LSMs and fractal trees handle random inserts much better. But you still lose locality of lookups: items generated near each other time-wise are not correlated on disk.
In terms of space usage and its impact on the buffer pool, storing as 128-bit integers (if your DBMS supports it, which MySQL does not) is essentially equivalent to the bin16 case in the first "UUID insertion rates" graph. Performance still falls off a cliff, just a little bit later. As the author says, "The use of a smaller representation for the UUID values just allows more rows to fit in the buffer pool but in the long run, it doesn't really help the performance, as the random insertion order dominates."
If you look at the fastest approach the author tried in the second "UUID insertion rate" graph, it actually stores a modified UUID in a 36-character string. There is a missed optimization to store the modified UUID as 16 bytes, rather than 36, but imposing some order on the keys is the dominant improvement.
If it's ever being turned into a string and stored or transmitted it's always better to use a standard base64 encode to keep the string to 22 chars instead of the standard 36/38 chars.
You can also use base85 and go to 20, but you get into some funky chars there.
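For example, in Python (urlsafe base64 with the padding stripped, and the standard library's base85):

```python
import base64, uuid

u = uuid.uuid4()
b64 = base64.urlsafe_b64encode(u.bytes).rstrip(b"=")  # 22 chars, URL-safe
b85 = base64.b85encode(u.bytes)                       # 20 chars, funkier alphabet

print(len(str(u)), len(b64), len(b85))  # 36, 22, 20
```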
I recently read a book by Google’s head guy on API design that was specifically about designing APIs and it had a big section on what makes a good identifier and why people reach for UUIDs and why specifically it is a problem on multiple levels.
The thing that he ended up recommending, however, was super interesting in that I had never seen it mentioned before: basically, to use this instead: http://www.crockford.com/base32.html
I use crockford 32 to _represent_ my UUIDs, but they are obviously stored as binary. Is the only problem with UUIDs that sometimes they get stored as strings?
The only issue I've had with UUIDs is when they don't sort in increasing chronological order. RDBMSs don't appreciate high insert loads into random points in the index. Take care of that, however, and they're a treat.
This format was discussed in a HN first page post just this week:
> This alphabet, 0123456789ABCDEFGHJKMNPQRSTVWXYZ, is Douglas Crockford's Base32, chosen for human readability and being able to call it out over a phone if required.
F and S sound similar over the phone, at least on POTS landlines, as they don't carry the higher frequencies (> 4 kHz) that distinguish the S from the F. Note that cat names tend to have S sounds.
POTS = Plain old telephony service is restricted to a narrow frequency range of 300–3,300 Hz, called the voiceband, which is much less than the human hearing range of 20–20,000 Hz [from https://en.wikipedia.org/wiki/Plain_old_telephone_service ]
But if a mistake is made, and you wrote down L instead of 1 and sent it to me in an e-mail, I, knowing that it is Crockford base32, could easily deduce what mistake was made.
Yeah when he lays out the arguments for it in the book you can clearly see why it makes a huge amount of sense. The usability, the performance, the value of a checksum etc…
I'm a big fan of Crockford-flavored Base32, as in $dayjob the folks who did some of the fundamental engineering work for the product I work with decided 10+ years ago to use Crockford-flavored Base32 as the way to expose most user-visible IDs. Things I love about it are all mentioned in Crockford's spec, but I'll restate them here:
- It's case-insensitive allowing data encoded with it to survive going through most random line-of-business applications which may have errant UPPER() or lower() calls somewhere in their depths, as well as making it easy for humans to talk about out-loud (no need to specify case makes it easy)
- Making all the "pillar" shaped characters (I, i, 1, l) equivalent and all the "donut" shaped characters (o, O, 0) equivalent means big swaths of human typing mistakes are avoided. After so much exposure to Crockford Base32, I now loathe having to decipher "is it a capital 'I'? Or a lowercase 'l'?"
Overall, it's a great way to encode any user-facing ID; just make sure you understand that just using this encoding won't stop users from realizing that Crockford Base32 encoded IDs are sequential. If you want sequential IDs to appear non-obviously sequential after encoding you'll need to use additional techniques. Assuming Crockford Base32 obscures sequential IDs is the one case where I saw someone do something they shouldn't as it directly relates to Crockford Base32.
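The forgiving decode side is simple to sketch in Python (a toy decoder following Crockford's substitution rules, not a production library):

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford base32 (no I, L, O, U)

def decode(symbols: str) -> int:
    """Decode a Crockford base32 string, applying the spec's forgiving
    substitutions: i/I/l/L -> 1, o/O -> 0, hyphens ignored, case ignored."""
    n = 0
    for ch in symbols.upper().replace("-", ""):
        ch = {"I": "1", "L": "1", "O": "0"}.get(ch, ch)
        n = n * 32 + ALPHABET.index(ch)
    return n

# All of these decode to the same value despite the "typos".
assert decode("2gI") == decode("2G1") == decode("2g-l")
```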
I loved it, think it all made a ton of sense. Lots of code samples, no weird technology choices (normally they would do it all via protobufs and gRPC but they keep the same principles and just use HTTP instead and Typescript for code samples)
We use crockford tokens heavily where I work, highly recommend them. We typically use a prefix to denote the type of thing being identified, like “U-“ for User followed by a 10-15ish long crockford token. Works great.
"Bad for performance as primary keys" -> "Bad for performance as primary keys in MySQL". This isn't an issue in PostgreSQL and perhaps the lesson here is that as you scale, you need to understand more about the internals of the DB system you've chosen. This isn't limited to RDBMS as it's pretty easy to show trade-offs in choosing a NoSQL as well.
You are right, the internals are important. When I tested this, a long time ago [1] [2], I found that randomly distributed keys on PostgreSQL were indeed faster than e.g. on MySQL. Which surprised me! I still don't quite understand why. But, even with PostgreSQL, sequential (or nearly sequential) keys are much faster.
> Which surprised me! I still don't quite understand why.
Because MySQL (specifically innodb) will cluster by the PK by default, while Postgres will not.
Meaning in MySQL, by default, the table will be stored in PK order, so when you insert a UUID at a random location any record which sorts after it has to be moved to make room, for nothing, since UUID ordering is not really relevant (at least for the current standard UUIDs).
In pg, clustering is not a property (at least for now); it's an explicit one-off operation, which is not performed by default. So when you insert a row, whether UUID- or INTEGER-keyed, it's just tacked on wherever there's free space. The UUID still has to be inserted at the right place in the index's btree, which can lead to a lot of cascading changes, and it's more expensive to compute, store, and load than a simple integer (especially 32b), but it doesn't induce an arbitrarily high amount of shuffling in the storage layer of the table itself.
While the problem in the article is less of an issue in Postgres (the index cache-locality issue is still there), they are still slower than serial ones.
I don't know if you avoid enough problems by using them as alternate keys in Postgres for that to be faster; my guess is that just using them as the primary key would be faster than a serial primary key plus a UUID alternate key. Still, UUIDs are much more useful as client-facing data than as a pure database value, so I would also do that (and standardize my PK format), probably paying a performance penalty.
"Slower" is a matter of context. Like with most things, there are trade offs to allow for more concurrency. UUIDs are intended to avoid serialization with ID generation, which makes sense in a federated system. Yes, they take up more space and aren't all neatly ordered, but the trade off works out well for the right kind of systems (FlakeIDs are a better trade off for more contexts). The problem this article is describing is you're getting the worst of both worlds, because you're still serializing in the MySQL database...
This is what I do: any public-facing ID - in a URL, or js/html that would show an ID - is a UUID, and then the backend queries are done with primary keys / integers.
Yeah. The main problem is people using them as primary keys in naïve systems like relational databases. You can't just expect a relational database to magically become a distributed system just by using UUIDs. There is a bit more work to do than that.
People use UUIDs for more reasons than just making things distributed. You can generate them client side if that's advantageous, you prevent leaking information about how many records there are in the system and prevent guessing of other potential PKs and potential unauthorized access, and there's some optimization strategies that benefit from not relying on a serial PK.
If you're obfuscating, then you should be using a DHT to map surrogate keys to internal keys that are more convenient for use in a MySQL index... or just stop using MySQL.
I've had occasional uses for it in systems that aren't distributed at all. It's a handy property of UUIDs, if UUIDs are useful for a particular use case, I'm going to use them, what they're 'supposed' to be for is irrelevant.
Somebody thought RabbitMQ did load balancing because it had three redundant nodes. Pretty much ruined the company. Definitely caused all of us to lose our equity.
Calling relational databases (HN’s preferred storage system) naive probably sounds like trolling to most people. There are also plenty of distributed relational databases. People downvote comments that sounds like trolling or flamebait.
I use UUIDs but I don’t know why they would magically make my Postgres a distributed system. I like them because the client can generate them offline.
What? Really? Naïve in this context means a general purpose solution that doesn't "know" about your use case. Have people never heard of a naïve algorithm or solution?
Distributed relational databases aren't naïve in this context. MySQL is.
> Have people never heard of a naïve algorithm or solution?
I have a fairly recent PhD in algorithms and I haven't heard naïve used this way. When I hear naïve, it usually just means "does the immediately obvious thing".
For what you're trying to say, the term I'm familiar with is "oblivious", e.g. "oblivious routing" or "oblivious local search", occasionally with modifiers such as "cache-oblivious".
This is a typo, v3 isn't random. It is generated deterministically from inputs.
> The only “repeated” value is the version, “4”, at the beginning of the 3rd field. All the other 124 bits are random.
And this is close but not quite correct. UUID v4 has a couple of other fixed bits, there are only 121-122 random ones. There are patterns in the text representation other than constant numbers. :)
I'm sure I take bigger penalty hits for less, sorry but UUID's "click" in my head and they prevent a whole slew of foot-guns. It would be one thing if auto-inc was the same as UUID but there are really annoying things with auto-inc (not knowing the id until after insert being top of mind) and also you can generate UUID's client-side/offline if needed. Yes, I know some people argue for auto-inc as primary and still use UUID for client-facing but I don't understand what that really gets you and it seems way more complicated.
I am very much not a database person, so forgive me if this is a dumb question.
I'm reading this article and it says that UUID are compared byte by byte, and seems to be indicating they're stored as string. Is that actually the case? I would have assumed that SQL supported 128 bit ints, but this seems to imply it does not.
Another question: if a column is set to char(fixed size), do the various SQL engines really not optimise to do multi-word comparisons? (e.g. 8 bytes at a time, then 4, 2, 1, as size requires)
MySQL doesn't have a UUID type, so naive implementations use (VAR)CHAR. Those in the know use BINARY(16) and MySQL 8 now has helper functions to convert to and from hex. Apparently MariaDB will soon have a native UUID type. PostgreSQL has had them from for years.
It often doesn't matter if you use a 128bit int or a string since either way you're loading 4K/8K/16K from ssd/hdd. Database people do all sorts of silly things like storing datetimes as normalized ISO strings (2022-01-08T08:18:20) instead of using 64bit unix time. They get away with it because both the backend (durable storage) and the clients (webapps responding to a user 10ms away) are incredibly slow compared to CPUs.
That aside, Postgres stores UUIDs as a 128bit int, not as a string, so it almost certainly uses multi-word comparisons via memcmp: https://stackoverflow.com/a/29882952/703382
You can fit many more uuids encoded as 128 bit numbers than as strings in a 4K page. Hence you'll need fewer pages to store your data and that might make a difference between fetching from cache vs fetching from disk.
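Back-of-envelope, ignoring page headers and per-row overhead:

```python
PAGE = 4096                        # a 4K page, as in the comment above
BINARY_UUID, CHAR36_UUID = 16, 36  # bytes per key

print(PAGE // BINARY_UUID)   # 256 binary-encoded UUIDs per page
print(PAGE // CHAR36_UUID)   # 113 text-encoded UUIDs per page
```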
At least in the case of Postgres, I know from reading the source before that it stores UUIDs as 128-bit values and does optimized binary comparator operations on those values. It can export and import UUIDs as text strings but does not treat them as text internally. I'm not sure why any competently engineered database engine would do otherwise.
Also entire classes of bugs. You screwed up a JOIN, or an application-level lookup for that matter? With UUIDs you'll get no results, with sequential ints you'll get a valid but wrong result.
Or worse, the right result for the wrong reason. I've actually seen a case where creating a new entity in the application populated X records in X child tables, each with a sequential ID, and as a result all of them had the same surrogate PK. They were 1:N relationships in principle, but the software wasn't feature complete yet so the actual records were all 1:1.
Years later one of those tables finally received some extra records, and it caused a really weird bug because a query had accidentally used the PK instead of the FK as a join key, but for years it had happily chugged along because the two columns were in sync.
Yep, so we need solutions for using these practices in a performant way rather than hearing they are not 'good' and then having to explain over and over again why they are there. Our datasets that use UUIDs have not had issues with performance but of course we keep looking for ways to keep using UUIDs while improving performance. It would be better to provide solutions on how to do that and spend time to get performance on par with int keys. Like someone else said; 32 bit int keys are no good anyway for many cases, so let's go to 128 bit, optimize for that and everyone is happy.
In an environment where millions of events are processed every second, being able to uniquely identify them is a must, and temporal keys are not always an option.
Just a month ago we migrated many of the columns from integers to UUIDs (encoded as binary(16)) in several critical tables which are pretty large (Percona server, too), and so far I haven't heard about any serious performance degradation after the release.
I didn't mention that our DB setup uses sharding, and every tenant has their own DB shard (there are tens of thousands of shards). I just checked that one of the largest tenants has 2.2 mln rows in one of the affected tables, which is usually joined with 2-4 more related tables using UUIDs (another such table is 1.1 mln rows, for example), and they're on the hot code path because it's the core of the system. Maybe with sharding the difference is negligible? During code review I raised the concern that inserts and joins can become much slower after we migrate it to UUIDs, but so far my fears haven't materialized. Usually with these tables we have performance problems on the application side, not in the DB, such as ORM fetching data in a very inefficient way, or using too much RAM. Maybe it'll bite us in the long term as the tenants' shards grow in size, who knows.
A few million rows is tiny for a database, so it shouldn't be an issue in most cases. That much data will trivially fit into cache (these days, possibly even CPU cache). It probably won't become noticeable until your tables are much larger than cache memory.
base64 means the "vocabulary" used has 6 bits (2^6 = 64). Hence, to complete a full number of bytes (without using any padding), you need 4 b64-letters:
4 * 6 = 24 = 8 * 3
So, you're correct... the exact number of b64 digits it takes to represent "actual" bytes goes like this:
EDIT: As you can see, this works out to an overhead of between 33% (the best case, where the encoding requires no padding, which happens every 3 rows above) and something like 37%, decreasing with the number of bytes being encoded and approaching the minimum of 33% (e.g. to encode 1024 bytes, you need 1366 b64 digits, an overhead of 1.333984375x).
It's better than nothing, but one of the values of UUIDs for identifiers is that you can create new ones client-side while offline. These "sequential" UUIDs will fail standard UUID validation because of the byte swapping and, in my experience, when used in offline-capable apps, will result in sparse clusters of sequential UUIDs that yield an unpredictable improvement over truly random UUIDs.
I think the author is missing part of the overall picture, e.g. event-driven scenarios,
where you don't have to check for collisions with a DB. He mentioned generating the PKs on the remote client, but that doesn't capture the interesting bits.
You generate the new object with the GUID already attached.
You send it to the API/microservices and it gets created there, fire-and-forget style. And the remote client has an ID for the newly created object to do something with.
But now the remote client has an ID of something that may or may not exist the next time they try to use it depending on whether or not it actually made its way into the database.
I've seen this kind of architecture before. It sounds nice but is loaded with consistency problems.
There's patterns for implementing it. It's not really any different than using more than 1 database which is unavoidable in many scenarios (interacting with 3rd parties)
If the client needs to use the object right there and then, it can start polling the GET for that ID until it returns 200.
A slightly more sophisticated pattern is to have the initial creation command return an ID of the creation event. Then the client can poll that ID and just check that the event has been processed successfully. This requires the client to trust the server a little, but it means it can just poll a small, temporary, probably in-memory "recent events" table instead of hammering the main API.
A more efficient design is to YOLO, submit the second command while the first is still being processed, and trust the server to handle the commands in the right order. This is fine for backend services, unfortunately human users get annoyed at their frontend when it says "sorry you have to do the purchase finalization again because an earlier 'add to cart' command failed and I only noticed it now".
What if the object is deleted again? You could still have an inconsistent picture, no? And you could create and delete over and over again, so that a client has an inconsistent picture all the time.
That's what eventual consistency is, isn't it? It's definitely not the easiest thing to design systems around, but at a big enough scale you sometimes don't have a choice.
There's also many ways to "work around it", so that it doesn't seem inconsistent to the user.
I'm definitely not arguing that it's how most systems should be designed though.
That's a different design issue then - it means lack of synchronization. It is orthogonal to using uuids at a client. In fact, (properly) using uuids are a reasonable way to resolve this.
I don't think you would ever get a 201 Created in an eventual consistency scenario. You would get a 202 Accepted. Unless the API is lying to the clients.
"Created" can mean different things to different people. Maybe an ecommerce order is created in the sense that it's been updated in the inventory database and a record added to some sort of fulfilment system, but there are still outstanding data warehouse sync jobs, some order analysis/product recommender job, and some anti-fraud jobs (that just need to complete before the order picking in the warehouse begins).
Accepted can mean that it will eventually be created though. In the sense that it's validated and guaranteed to be created at a later point ("eventually").
People should have a look at k-sortable unique identifiers (KSUID). In binary they are represented by 20 bytes, and their string representation has 27 characters, which is shorter than UUIDs since they use a base62 encoding. They are sortable since the 20 bytes start with a 32-bit UNIX timestamp followed by 128 random bits. They should be very efficient for clustered indexes / B+-trees.
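A toy Python sketch of that layout (the epoch offset and base62 alphabet follow the original KSUID spec as I understand it; treat those details as assumptions rather than a reference implementation):

```python
import os, time

KSUID_EPOCH = 1400000000  # seconds offset used by the original KSUID spec (assumption)
B62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def ksuid() -> str:
    """20 bytes = 4-byte big-endian timestamp + 16 random bytes,
    base62-encoded and left-padded to the fixed 27-character width."""
    ts = int(time.time()) - KSUID_EPOCH
    raw = ts.to_bytes(4, "big") + os.urandom(16)
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, r = divmod(n, 62)
        out = B62[r] + out
    return out.rjust(27, "0")

# Fixed width + this alphabet order means lexicographic order tracks creation time.
print(ksuid(), ksuid())
```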
Note also that as long as you have a single central database you don't need UUIDs. They are only needed if you have several processes creating objects without coordination.
Note some of the newly proposed UUID formats would take care of some of these issues [1]. They are time-ordered but still have a good bit of random entropy.
Only if you are creating persistent objects in multiple places that may not be in communication with one another, and that at some point need to be distinguished from each other, do you need UUIDs.
Even in mobile apps, which seem like the most common use case for needing them: Unless your app creates these objects while not connected, you don't need UUIDs.
If your backend database is where objects get created, you don't need UUIDs.
> The missing 4 bits is the version number used as a prefix to the time-hi field.
Why would you use 4 bits for a version number in something that's supposed to be unique? What is the benefit of following this specification despite such cost, versus creating 128 unique bits based on time / random generators / machine IDs yourself?
The ability to mix different types of ID in one column or one business data store.
When you start with random UUIDs, and then decide you actually wanted per-host-namespaced ones halfway through, if you have allocated zero bits for the version ID you're up the creek.
You're trading off present-day efficiency for future-day flexibility. Whether that's wise for a particular case depends on that case.
Because 124 random bits are still more than enough to make collisions extremely unlikely (you'd need on the order of 2^62 UUIDs even with the birthday paradox - good luck storing them all), yet they are still recognizable as being of that specific format.
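Quick sanity check of that 2^62 figure using the usual birthday approximation (collisions become likely once you've drawn roughly the square root of the space):

```python
import math

random_bits = 124
# Birthday approximation: ~sqrt(2^bits) draws before a collision becomes likely.
threshold = math.isqrt(2 ** random_bits)
print(math.log2(threshold))  # 62.0
```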
Does anyone know what data type Reddit uses for their post/comment ID? It seems like a short alphanumeric yet unique identifier, much shorter than a UUID.
Also what does twitter use for their post/comment id? Seems like some sort of big int?
That sounds nice and simple, and fits in 64 bits (vs 128):
"we settled on a composition of: timestamp, worker number and sequence number.
Sequence numbers are per-thread and worker numbers are chosen at startup via zookeeper"
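Roughly, in Python (the bit widths and epoch below are the commonly described Snowflake layout, stated here as assumptions rather than taken from the quoted post):

```python
import time

EPOCH_MS = 1288834974657        # commonly cited Snowflake epoch (assumption)
WORKER_BITS, SEQ_BITS = 10, 12  # commonly described widths (assumption)

def snowflake(worker_id: int, sequence: int) -> int:
    """Pack milliseconds-since-epoch, worker number, and per-thread sequence
    number into a single 64-bit, roughly time-ordered integer."""
    ms = int(time.time() * 1000) - EPOCH_MS
    return (ms << (WORKER_BITS + SEQ_BITS)) | (worker_id << SEQ_BITS) | sequence

print(snowflake(worker_id=3, sequence=0))
```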
> The remaining of the UUID value comes from the MD5 of a random value and the current time at a precision of 1us.
I might be misunderstanding something here but if your random seed is based on time, under high-concurrency, doesn't this risk collisions? I can't see any thread-safety guarantees in the documentation.[1]
As far as I can gather from this post and looking at the data type documentation, MySQL does not have a specific UUID type, but Postgres does.[0] I'll assume that Postgres has some internal optimisations to UUID that MySQL thus lacks.
Addendum: I also realise this is anecdotal, but someone on Stackoverflow mentions a significant speed up from changing `text` to `uuid` in Postgres.[1] But this also fits with what I've been told on #postgresql on libera.chat. That being said, integers would still outperform uuid.
A lot of the problems listed in the post are physical issues with larger data types that are somewhat random - eg the size, how clustered indexes work, and you will have the same problems with them in SQL Server.
If you order the data based on the uuid and your uuid is randomly distributed, then you will almost always be writing the data into the middle of your table, physically. You can cut the impact somewhat by using sparse tables (leaving lots of empty space) but eventually you'll be re-writing the data.
SQL Server has a sequential uuid type which avoids exactly this problem.
Disclaimer, I have no data to back up this solves the performance problems described. It's just likely to solve that "writing in the middle of the table" part.
Because you don't want your record IDs to be perfectly sequential data.
You only care about them being sequential enough that your database engine will write them down quickly and efficiently (for engines like InnoDB or MSSQL which have such behaviour). But as a developer you typically want them to be random so that you can generate them in a decentralised manner, prevent guessing or accidental joins, safely merge records from multiple tables, etc.
Sequential UUIDs usually preserve enough randomness for the latter (as long as you don't do stuff like generating a trillion IDs exactly at midnight), while providing enough sequentiality for the former.
But now the ID has a random concatenation of data in it, like the MAC address of the node that inserted it into the database and a timestamp.
It'd be much more orderly for each node to have a sequential ID, the ID of the node that created it, a timestamp when it was created and a real UUID v4 for if someone wants to prevent collisions. Same data, but the primary key on an individual node is a lot smaller (half the size) and the metadata is available for use if someone wants it. There is natural room to build up namespaces and such. It is better to keep these observations separate instead of munging them all into the record ID.
I 100% agree that metadata like timestamps and machine IDs, if you care about them, should have their own columns.
I only disagree with:
> It'd be much more orderly for each node to have a sequential ID
I've outlined some reasons above why a random ID has advantages over a sequential one.
I run Postgres in prod so I can use purely random UUIDv4, but for those running mysql or mssql, UUIDv6 and the like are a useful compromise between sequentiality and randomness. The fact that they happen to use timestamps as the source of sequentiality is only an implementation detail.
There are several problems, one of which is also ergonomics.
I've only read it partially (it's a very interesting read nonetheless), however, this is a key point (italic mine):
> Let’s assume a table of 1B rows having UUID values as primary key and five secondary indexes. If you read the previous paragraph, you know the primary key values are stored six times for each row. That means a total of 6B char(36) values representing 216 GB.
It assumes that MySQL users store UUIDs as CHAR(36), which is very wasteful, since a UUID actually requires 16 bytes (128 bits).
Now, one can store UUIDs as binary blobs in MySQL, however they are not human-readable, so one tends to store them as CHAR(36) instead, wasting 20 bytes per entry (in total, requiring 2.25 times the strict necessary).
By supporting UUID as native data type, the storage can use the strict necessary amount of bytes, but still maintain readability, because the RDBMS will convert the data to a human-readable form.
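Back-of-envelope, using the article's 1B-row, six-copies example and counting key bytes only (no row headers or padding):

```python
rows = 1_000_000_000
copies = 6  # clustered PK + five secondary indexes, per the article

print(rows * copies * 36 / 1e9)  # 216.0 GB of key data stored as CHAR(36)
print(rows * copies * 16 / 1e9)  #  96.0 GB stored as BINARY(16) / a native UUID type
```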
Additionally, MySQL's clustered indexes are subject to write amplification, which makes things worse.
I haven't fully read the rest of the article though, which likely includes other considerations about the locality problems due to randomness. Things get even more complex due to the relationship with the data structures.
Postgres is somewhat different mainly because it doesn't use clustered index primary keys. So the row's position on disk is not related to the primary key index entry's position on disk.
Additionally using the less cryptographically secure uuid v1 can be a performance optimization since it has implicit time based sorting.
> Additionally using the less cryptographically secure uuid v1 can be a performance optimization since it has implicit time based sorting.
Except the way the fields are laid out basically defeats the point: UUIDv1 lays out a 60-bit timestamp starting from its lower 32 bits, so it only sorts within a 7-minute (2^32 * 100ns) bucket.
Hence the proposal for UUIDv6, which lays out the exact same timestamp in the reverse order (starting from the "top" 32 bits), making it naturally sortable.
Do check that they're not using a "cheating" implementation; I think some DBs have "sequential UUIDs" which are either UUIDv1 the right way up (aka UUIDv6) internally, or another scheme which yields a sequence (e.g. mssql's NEWSEQUENTIALID).
Alternatively, it's possible that they created pseudo-UUIDv1s by hand, laying the data out UUIDv6-style.
In my testing (years ago, don't have the data) there was no detectable difference between PKs of the uuid and bigint types in my application, but there was a difference between uuid and int (int was faster). uuid is 16 bytes, bigint is 8 bytes, int is 4 bytes, so at some scale there will be a performance difference.
I've experienced some performance differences with UUIDs vs integers, but it has been minor enough to never cause any issues, even in tables with billions of records.
This topic has been raised more than once at my work and it scared a lot of people.
It's important to understand your use case before you embrace any other solution.
This will affect you only when you are frequently creating and storing new IDs, which leads to reshaping btrees.
If you have IDs and its number is under control and doesn't change a lot you're just fine.
What's the purpose of using non-random data such as time and MAC address in UUID? It increases probability of collision compared to purely random bits and if you need time or MAC it seems better to store them as separate fields - easier to perform selects, groupings, etc.
Why would anyone use a random value as a primary key? I haven’t done databases in a long time but isn’t it standard practice to use a sequential incrementing value for the primary key?
reasons are:
1. no need for coordination in distributed system, no need to check what is the next available id.
2. if userid is visible to users, sequential userid gives away information about the amount of users and allows guessing other valid userids.
Fun improvement from my project a long time ago: a UUID string has 36 chars, and to save space we created a shortened version by removing all the dash characters. That removes 4 chars, so the field in the MySQL table only needs 32 chars.
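In Python terms, that's just the hex form of the UUID:

```python
import uuid

u = uuid.uuid4()
print(len(str(u)), len(u.hex))  # 36 vs 32: the hex form simply drops the dashes
```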
This seems to be just about using UUIDs as indexes in a DB, not using UUIDs in general as an ID for things which I'm not seeing any reason not to continue doing.
Sure, as long as you never want to look things up or reference them by ID, then there's no reason to worry. Otherwise, yes, you'll have the exact same problems laid out in the article