> That's true about any database, not just MongoDB; nothing new here.
That's exactly the point where we started. A well-designed system fails "safe"; it should obey the principle of least surprise. Specifically: MongoDB should default to synchronous writes to disk on every commit; official drivers should default to acknowledging every network call; and MongoDB shouldn't allow remote access from the network by default. Once you want higher performance or remote access, you can read up on the relevant configuration options and evaluate the trade-offs as needed (see the sketch below).
Other systems are safe by default (e.g. PostgreSQL), and their out-of-the-box performance suffers and setup becomes more complex because of it. MongoDB could ship "safe" (with the same trade-offs), but chooses not to. That sort of marketing-led decision-making has no place in my technology stack.
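For concreteness, here's roughly what opting back in to that "safe" behavior looks like from a driver. This is a minimal sketch using PyMongo's WriteConcern API; the database and collection names are hypothetical:

```python
from pymongo import MongoClient, WriteConcern

# Connect to a mongod bound to localhost only (no remote network access).
client = MongoClient("mongodb://127.0.0.1:27017")

# w=1 waits for the server to acknowledge each write; j=True additionally
# waits until the write has reached the on-disk journal before returning.
orders = client.shop.get_collection(
    "orders", write_concern=WriteConcern(w=1, j=True)
)

# With an acknowledged write concern, a failed write raises an exception
# instead of being silently dropped.
orders.insert_one({"sku": "abc-123", "qty": 1})
```

Binding to 127.0.0.1 has to be done on the server side (mongod's bind_ip setting), but the write-safety knobs are all driver-side one-liners, which is exactly why shipping them as defaults would cost so little.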
The Principle of Least Surprise has local scope. You may be surprised to find asynchronous writes on an arbitrary database, but not on one that is documented, advertised, and marketed as asynchronous-by-default.
'Surprise' is relative to the current environment and paradigm (in this case, asynchronicity). If you find that surprising, it means you should have read the basic documentation properly.
> MongoDB could ship "safe" (with the same trade-offs), but chooses not to.
Because that's one of the main points of choosing MongoDB...
This "main point" is never mentioned on their philosophy page. And the introduction lists "Optional streaming writes (no acknowledgements)", which makes it sound like the default is synchronous writes.
I admit that MongoDB's unsafe default tuning becomes quite obvious once you read more of the manual, but I can hardly say 10gen is without blame for the confusion it causes.
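To make the "streaming writes" wording concrete, here's the distinction as it appears at the driver level. Again a PyMongo sketch with hypothetical names: w=0 is the fire-and-forget mode, w=1 waits for the server's acknowledgement.

```python
from pymongo import MongoClient, WriteConcern

db = MongoClient("mongodb://127.0.0.1:27017").test

# Unacknowledged ("streaming"): returns immediately; server-side errors
# pass silently, so a dropped write looks identical to a successful one.
streaming = db.get_collection("events", write_concern=WriteConcern(w=0))
streaming.insert_one({"n": 1})

# Acknowledged: waits for the server's reply and raises on failure.
acknowledged = db.get_collection("events", write_concern=WriteConcern(w=1))
acknowledged.insert_one({"n": 2})
```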
I understand where you're coming from, though I disagree.
I hope you continue to explain these caveats to everyone considering MongoDB. I hope you recognize that not everyone is an expert in these limitations, and that you clearly explain to those who might not know it that MongoDB's "2GB limit" really means "data loss", as does 'asynchronous'. Then you'll see fewer blog posts from people who didn't see through the marketing speak and were bitten by the defaults.
Right now, I think all these blog posts describing MongoDB losing data or performing poorly are getting upvoted because people are learning of these limitations for the first time.