I've used the "snowflake"-like approach in the past with great success. It's really not all that complicated. The reliance on time in the Instagram approach is a bit scary. A few ms off here and there could really hurt this scheme. How do you handle seamlessly transitioning these IDs across machines when your shards move?
Thanks for the comment! We don't need the IDs to be exactly sortable, only roughly sortable within a second or so. As long as the clock doesn't move backwards on any given machine (we use ntpd in its gradual-adjustment mode), the IDs are unique.
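For reference, the layout the article describes composes roughly like this. A minimal Python sketch, assuming 41 bits of milliseconds plus the 13-bit shard ID and 10-bit sequence discussed further down the thread; the epoch constant and the names are placeholders of mine, not Instagram's actual code:

    import time
    from typing import Optional

    CUSTOM_EPOCH_MS = 1293840000000  # hypothetical custom epoch (2011-01-01 UTC)

    def make_id(shard_id: int, sequence: int, now_ms: Optional[int] = None) -> int:
        """Compose a roughly time-sortable 64-bit ID:
        41 bits of milliseconds | 13-bit logical shard ID | 10-bit sequence."""
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        ts = (now_ms - CUSTOM_EPOCH_MS) & ((1 << 41) - 1)
        return (ts << 23) | ((shard_id & 0x1FFF) << 10) | (sequence & 0x3FF)

    # IDs generated a moment apart sort in time order.
    a = make_id(shard_id=5, sequence=1)
    time.sleep(0.002)
    b = make_id(shard_id=5, sequence=2)
    assert a < b

Because the timestamp sits in the high bits, a machine whose clock only drifts forward keeps producing IDs that sort after everything it has already issued.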
The way we move shards is to use PostgreSQL's built-in streaming replication to create an exact, in-sync copy of a set of tablespaces, then 'fail over' to a new machine and start reading/writing to a subset of those tablespaces (this is similar to how Facebook describes their shard-moving process).
Not really. It's like MongoDB, where the first four bytes of an ObjectId are a timestamp. Whether that timestamp is synced with other instances isn't paramount, because the actual value of the timestamp doesn't matter. What does matter is that the timestamp is new and increasing every second. That's what preserves the sort order. Combined with the 13 bits that represent the logical shard ID from this article, Instagram can guarantee uniqueness of an ID within the granularity of a second.
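To make that concrete: two IDs in this layout can only collide if the timestamp, shard ID, and sequence all repeat. A small sketch (same hypothetical layout as the one above) for pulling those pieces back out of an ID:

    def split_id(id_: int):
        """Split a 64-bit ID in the layout sketched above into its parts."""
        sequence = id_ & 0x3FF            # low 10 rotating bits
        shard_id = (id_ >> 10) & 0x1FFF   # 13-bit logical shard ID
        ms_since_epoch = id_ >> 23        # 41-bit millisecond timestamp
        return ms_since_epoch, shard_id, sequence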
"What does matter is that the timestamp is new and increasing every second."
Right. I'm just sayin' that you have to be careful when you move data with a caveat like that. Moving the shard keyspace (the 13 bits) to a new machine that started generating IDs even one second behind (the first 4 bytes) would be troublesome, no?
Yep, definitely something to watch out for. At worst, though, you'd get a duplicate-key error when trying to insert, and can retry without the risk of a duplicate ID floating around your system.
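In practice that retry can just wrap the INSERT. A rough sketch using psycopg2, with made-up table and column names, and generate_id standing in for whatever hands back the next candidate ID:

    import psycopg2

    def insert_with_retry(conn, generate_id, max_attempts=3):
        """Insert a row, regenerating the candidate ID on a duplicate key."""
        for _ in range(max_attempts):
            candidate = generate_id()
            try:
                with conn.cursor() as cur:
                    cur.execute(
                        "INSERT INTO photos (id, owner_id) VALUES (%s, %s)",
                        (candidate, 42),  # owner_id is a placeholder value
                    )
                conn.commit()
                return candidate
            except psycopg2.IntegrityError:
                # Primary key already taken, e.g. a clock that stepped backwards
                # reproduced an old timestamp/shard/sequence combination.
                conn.rollback()
        raise RuntimeError("no unique ID after %d attempts" % max_attempts)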
The lower, rotating 10 bits should give them a reasonable safety margin. If they're creating fewer than 128 entries per second in a particular shard (right now they're below that across their entire datastore), the 1024-value sequence takes about 8 seconds to wrap, so their clocks would need to be out by 8 seconds to cause a problem (quick check below).
They should definitely be monitoring their clocks though :)
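For anyone who wants that back-of-the-envelope in code, reusing the 10-bit sequence from the article (the function name is just mine):

    SEQUENCE_SPACE = 2 ** 10  # the 10 rotating bits give 1024 values per shard

    def clock_skew_margin_seconds(inserts_per_second):
        """Roughly how far a shard's clock can step back before a sequence
        value can recur alongside a timestamp it was already paired with."""
        return SEQUENCE_SPACE / inserts_per_second

    print(clock_skew_margin_seconds(128))  # 8.0, as noted above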