Holy god. The post mentions "We evaluated a few different NoSQL solutions, but (...)".
I couldn't even imagine having to _think_ about this sort of thing; CouchDB makes this an absolute no-brainer. I mean, the act of creating a document assigns it a UUID in the database _by_ _default_.
Or, do you want to fetch a UUID before assigning it to any data? localhost:5984/_uuids
Want to fetch 10? localhost:5984/_uuids?count=10
Want to fetch a million? localhost:5984/_uuids?count=1000000
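For anyone who wants to try the calls above from code rather than a browser, here is a minimal sketch. It assumes a CouchDB node on the default local port, as in the URLs above. The helper names (`uuids_url`, `parse_uuids`, `fetch_uuids`) are made up for illustration. The only CouchDB behavior relied on is that `_uuids` returns a JSON object with a `uuids` array.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local CouchDB node on the default port, as in the URLs above.
COUCH_URL = "http://localhost:5984"


def uuids_url(count=1, base=COUCH_URL):
    """Build the _uuids URL for the requested number of IDs."""
    return f"{base}/_uuids?{urlencode({'count': count})}"


def parse_uuids(body):
    """Extract the list of IDs from a _uuids JSON response body."""
    return json.loads(body)["uuids"]


def fetch_uuids(count=1, base=COUCH_URL):
    """Ask the server for `count` fresh UUIDs (needs a running CouchDB)."""
    with urlopen(uuids_url(count, base)) as resp:
        return parse_uuids(resp.read().decode("utf-8"))
```

Calling `fetch_uuids(10)` against a running node should return a list of ten ID strings; the URL building and parsing are kept separate so they can be checked without a server.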
Instagram seems like the absolutely perfect candidate for CouchDB -- unstructured data, speed, HTTP querying, big data, attachments...
I used to work at Meebo, which hosts a large CouchDB system, but when it came time to choose a solution for Instagram we went with PGSQL. There was a lot I really liked about Couch, especially for the analytics system we built at Meebo (where the map/reduce views worked great), but I wouldn't classify it as a low-Ops-burden technology--like any newer db solution, there are some rough edges and more 'unknowns' at scale.
Also, we'd still have to write the middleware to assign data to shards and fetch data from shards in our system (unless we used something like BigCouch), so having a more tried-and-tested solution that we already understood well was more appealing.
The default _uuids API call generates random UUIDs; however, you could override the algorithm field in the _config with a different one, potentially "utc_random", which still gives random IDs while including a timestamp in the string to sort by later. Was this considered when CouchDB was being evaluated?
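To make the time-sortable idea concrete, here is a rough sketch of an ID with a timestamp prefix and a random tail. The layout (14 hex characters of microseconds since the epoch, then 18 random hex characters) follows my understanding of how "utc_random" is documented; this is not CouchDB's actual implementation, and `time_prefixed_id` is a hypothetical name.

```python
import secrets
import time


def time_prefixed_id(now_us=None):
    """Return 32 hex chars: 14 for microseconds since the epoch, 18 random."""
    if now_us is None:
        # Current wall-clock time in microseconds.
        now_us = time.time_ns() // 1000
    # Zero-padded hex timestamp first, so string order follows time order.
    return f"{now_us:014x}{secrets.token_hex(9)}"
```

Because the timestamp is zero-padded and comes first, sorting the IDs as plain strings orders them by creation time, while the random tail keeps IDs generated in the same microsecond from colliding in practice.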