"It gets hard when you want to do something such as getting the comments for an article. You can't say, get me the comments for article 450. You have to actually store a list of the comment ids in the article, get the article, and then request the comments based on that list in the article object. Worse, let's say two people comment on an article at the same time. You'd better make sure you start a transaction before reading the object, modify the list, and save it. If you don't, you'll hit race conditions that can see a comment's association lost (something you don't have to worry about with foreign-key associations)."
----------
I'm glad you brought this up. I've been working on a project that "solves" this issue. It's basically a distributed in-memory hash table like memcached, but the values are arrays that can be modified in a CRUD manner. I call it Alchemy (http://github.com/teej/alchemy/tree/master).
So in your example you would have:
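A minimal sketch of that idea in Python, assuming a hypothetical in-process `ListStore` class. This is not Alchemy's actual API; the key name and the comment ids are made up for illustration, and the lock only stands in for a server that applies one command at a time.

```python
import threading
from collections import defaultdict


class ListStore:
    """Hypothetical in-process stand-in for a key -> list store (not Alchemy's API)."""

    def __init__(self):
        self._lists = defaultdict(list)
        # Stands in for a server that applies one command at a time.
        self._lock = threading.Lock()

    def append(self, key, value):
        # The store performs the modification itself, so callers never
        # have to read the whole list, change it, and write it back.
        with self._lock:
            self._lists[key].append(value)

    def remove(self, key, value):
        with self._lock:
            if value in self._lists[key]:
                self._lists[key].remove(value)

    def get(self, key):
        with self._lock:
            # Return a copy so callers cannot mutate the stored list.
            return list(self._lists[key])


store = ListStore()
for comment_id in (12, 57, 93):  # made-up comment ids
    store.append("article:450:comments", comment_id)

comment_ids = store.get("article:450:comments")
```

Here `comment_ids` holds only the ids; the comment bodies themselves would still be fetched from wherever they are stored.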
Where you could then look up your three article comments out of Memcache, MySQL, or whatever else. At the moment, it's mostly just a proof of concept, though I am using it in a small live app. I feel like a true distributed, super-fast, relational, in-memory database is close; there are just a few more pieces of the puzzle to work out. For me, Alchemy is one of those pieces.
So, you can embed it right in the object just as easily as setting up another object in the store just for the ids.
That helps with foreign-key type situations, but it doesn't help for a lot of other things.
I guess my perspective on it is that the restrictions are fine for small apps with few features, but for small (size of data) apps you don't need 20,000 operations per second. MySQL handles it easily. Likewise, if you're dealing with a bigger app, you're going to want to be able to query data in more ways. MemcacheDB is great for what it is: a key-value store. It's not a replacement for an RDBMS.
In fact, you really can't create the kind of database you propose. The issue is that some problems in computer science have costs that grow at a certain minimum rate. For example, comparison-based sorting is an n * log(n) problem. You simply can't do better than that. That's how indexes work (very simplistically). If you limit yourself to operations that you can do in constant time, you have hash tables. You can't query a hash table except by key. Distributed hash tables are reasonably well-known systems, but they don't replace databases in most applications, and you can't make them more database-like unless you give up the property that makes them scale so well.
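To make the trade-off concrete, here is a rough Python sketch with made-up article data. It contrasts a lookup by key, a query on a non-key field that has to scan everything, and a sorted index that pays the n * log(n) cost once so later range queries can use binary search. The field names and helper functions are illustrative assumptions, not any particular database's API.

```python
import bisect

# Made-up data: article id -> record.
articles = {
    450: {"title": "Scaling", "score": 17},
    451: {"title": "Caching", "score": 3},
    452: {"title": "Sharding", "score": 42},
    453: {"title": "Indexes", "score": 17},
}

# Lookup by key: a single hash lookup, constant time on average.
article = articles[450]


def ids_with_score_at_least_scan(records, minimum):
    # Query on a non-key field: every record must be examined, O(n).
    return sorted(aid for aid, rec in records.items() if rec["score"] >= minimum)


# A simplistic "index": (score, id) pairs kept in sorted order.
# Building it costs O(n log n), and it must be maintained on every write.
score_index = sorted((rec["score"], aid) for aid, rec in articles.items())


def ids_with_score_at_least_index(index, minimum):
    # Binary search finds the first entry with score >= minimum in O(log n).
    start = bisect.bisect_left(index, (minimum,))
    return sorted(aid for _, aid in index[start:])
```

Both query functions return the same ids; the difference is that the hash table alone can only answer the question by scanning, while the index answers it by giving up constant-time writes.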
"So, you can embed it right in the object just as easily as setting up another object in the store just for the ids."
That doesn't solve any of the issues for which I built Alchemy. The point was to create an array store that could be reliably modified without transactions. I built it after having real-life issues with race conditions in Memcache where transactions weren't an acceptable alternative.
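The kind of race described here can be sketched deterministically in Python. The `KeyValueStore` class below is a toy, hypothetical stand-in with whole-value `get`/`set` plus a server-side `append`; it is neither Memcache's nor Alchemy's real interface, and the two "clients" are simply interleaved by hand rather than run as threads.

```python
class KeyValueStore:
    """Toy store: whole-value get/set, plus an append applied by the store itself."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return list(self._data.get(key, []))

    def set(self, key, value):
        self._data[key] = list(value)

    def append(self, key, item):
        self._data.setdefault(key, []).append(item)


def read_modify_write():
    kv = KeyValueStore()
    kv.set("article:450", [1])
    a = kv.get("article:450")       # client A reads [1]
    b = kv.get("article:450")       # client B reads [1] before A has written
    kv.set("article:450", a + [2])  # A saves [1, 2]
    kv.set("article:450", b + [3])  # B overwrites with [1, 3]; comment 2 is lost
    return kv.get("article:450")


def store_side_append():
    kv = KeyValueStore()
    kv.set("article:450", [1])
    kv.append("article:450", 2)     # client A
    kv.append("article:450", 3)     # client B; neither client holds a stale copy
    return kv.get("article:450")
```

With the read-modify-write interleaving, the second `set` silently discards the first client's comment id. When the store applies each append itself, both ids survive without the clients needing a transaction.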
In the end, I really would like to see cache solutions that can be easily dropped in for the different bottleneck situations, while keeping the standard RDBMS.