> There is no reason mongo couldn't be clear about this distinction -- say, rename "insert" to "async_insert" and have "insert" be a wrapper around async_insert and getLastError. But instead, it's the user's fault because they didn't read the docs.
Because if you don't read enough of the docs to understand that 'insert' is asynchronous insert, you don't understand MongoDB and haven't done your research.
Why should 'insert' default to synchronous? Why shouldn't we have a sync_insert function instead? The only reason is that you're assuming familiarity for people coming from SQL/synchronous-oriented DBMS, but why should MongoDB users be forced into an awkward design just because it's what people are familiar with from other DBMS?
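For anyone skimming, the naming scheme being argued over can be sketched roughly like this. To be clear, this is a toy in-memory stand-in, not MongoDB's driver API: `ToyStore`, `async_insert`, and `get_last_error` are hypothetical names taken from the proposal quoted above, and the "error" is just a duplicate `_id`.

```python
class ToyStore:
    """In-memory stand-in for a database connection (hypothetical, not a real driver)."""

    def __init__(self):
        self.docs = {}
        self._pending = []  # writes that were sent but not yet applied

    def async_insert(self, doc):
        # Fire-and-forget: queue the write and return immediately.
        self._pending.append(doc)

    def get_last_error(self):
        # Apply queued writes; return the first failure message, or None.
        error = None
        for doc in self._pending:
            if doc["_id"] in self.docs:
                if error is None:
                    error = f"duplicate key: {doc['_id']}"
            else:
                self.docs[doc["_id"]] = doc
        self._pending.clear()
        return error

    def insert(self, doc):
        # The proposed default: send the write, then wait for its outcome.
        self.async_insert(doc)
        error = self.get_last_error()
        if error is not None:
            raise RuntimeError(error)


store = ToyStore()
store.insert({"_id": 1})        # checked write, succeeds
store.async_insert({"_id": 1})  # duplicate is queued; the caller sees no error
```

The disagreement is only about which of the two behaviours gets the short name `insert`; the caller who uses `async_insert` never learns about the duplicate unless they ask with `get_last_error`.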
A good system is forgiving; it encourages exploration; if there's a choice between safety and performance it defaults to safety. If/when profiling shows the safe behaviour to be a bottleneck, then users can Google the issue and discover "Oh, I just need to set flag X; I can live with the consequences here".
Expecting the user to be an expert in your product from the start is simply not realistic; a well-designed system facilitates use by people of varying levels of expertise.
> A good system is forgiving; it encourages exploration; if there's a choice between safety and performance it defaults to safety.
Not if you're choosing a system that's explicitly marked for performance over safety.
> Expecting the user to be an expert in your product from the start
The 'product' in this case is a non-relational database, not an iGadget. The user can and should be expected to be familiar with the main strengths and weaknesses of the database as a whole.
There is no way you can convince me that someone who has done a reasonable level of due-diligence in investigating MongoDB can be surprised when it behaves asynchronously.
Kudos to you for doing your research. If you're saying "don't use MongoDB without doing at least N days of research first", then you're very much at odds with (my perception of) the 10gen marketing message.
I think you're right though: MongoDB should not be used without _lots_ of research into its limitations.
> I think you're right though: MongoDB should not be used without _lots_ of research into its limitations.
That's true about any database, not just MongoDB; nothing new here.
> then you're very much at odds with (my perception of) the 10gen marketing message.
10gen is fairly straightforward about the original issue, having blogged openly several times about their decisions. But at the end of the day, any engineer should do research beyond a marketer's pitch.
I won't doubt that there are people who make snap judgements about fundamental architecture based on marketing pitches[1], but that's very unfortunate, and the marketers really can't be blamed, especially when they make no effort to conceal the truth or deceive you!
> That's true about any database, not just MongoDB; nothing new here.
That's exactly the point where we started. A well-designed system fails "safe"; it should obey the principle of least surprise. Specifically: MongoDB should default to synchronous writes to disk on every commit; official drivers should default to acknowledging every network call; MongoDB shouldn't allow remote access from the network by default. Once you want higher performance or remote access, you can read about the configuration options to change and learn on-the-fly, evaluating the trade-offs as needed.
Other systems are safe by default (e.g. PostgreSQL), and their out-of-the-box performance and setup complexity suffer because of it. MongoDB could ship "safe" (with the same trade-offs), but chooses not to. That sort of marketing-led decision-making has no place in my technology stack.
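To make "safe by default, fast by explicit choice" concrete, here is a minimal sketch. None of these names are real MongoDB or driver settings; `WriteOptions`, `acknowledge`, and `fsync` are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteOptions:
    # Hypothetical knobs; both default to the safe setting.
    acknowledge: bool = True  # wait for the server to confirm the write
    fsync: bool = True        # wait until the write has reached disk


SAFE = WriteOptions()
FAST = WriteOptions(acknowledge=False, fsync=False)  # must be spelled out


def describe(opts: WriteOptions = SAFE) -> str:
    """Summarise what a caller can rely on under the given options."""
    if opts.acknowledge and opts.fsync:
        return "durable"
    if opts.acknowledge:
        return "acknowledged, not yet on disk"
    return "fire-and-forget"
```

Someone who never reads the options gets `describe()` returning "durable"; only someone who types out `FAST` ends up with fire-and-forget writes.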
The Principle of Least Surprise has local scope. You may be surprised to find asynchronous writes on an arbitrary database, but not for a database that is documented, advertised, and marketed as asynchronous-by-default.
'Surprise' is relative to the current environment and paradigm (in this case, asynchronicity). If you find that surprising, then you should have read the basic documentation properly.
> MongoDB could ship "safe" (with the same trade-offs), but chooses not to.
Because that's one of the main points of choosing MongoDB...
This "main point" is never mentioned in their philosophy page. And the introduction mentions "Optional streaming writes (no acknowledgements)" which sounds like the default is synchronous writes.
I admit that the default unsafe tuning of MongoDB becomes quite obvious when you read more of the manual, but I can hardly say 10gen is without blame for causing this confusion.
I understand where you're coming from, though I disagree.
I hope you continue to explain these caveats to everyone considering MongoDB. I hope you recognize that not everyone is an expert in these limitations, and that you clearly explain to those that might not know it that MongoDB's "2GB limit" really means "data loss"; as does 'asynchronous'. Then you'll see fewer blog posts from people that didn't see through the marketing speak and were bitten by the defaults.
Right now, I think all these blog posts describing MongoDB losing data or performing poorly are getting upvoted because people are learning of these limitations for the first time.
It's not that way because somebody in the 70's flipped a coin and decided that sync was heads.
It's because it's a reasonable assumption to make. Data loss shouldn't be a surprise; if I need speed and am willing to risk data loss I should have the option, but should explicitly choose to use it.
> if I need speed and am willing to risk data loss I should have the option, but should explicitly choose to use it.
You did, by choosing to use MongoDB.
(And if you chose MongoDB without being aware of that implication, you didn't choose MongoDB for the right reasons or didn't do your due diligence, because you cannot understand MongoDB's use case and tradeoffs if you were unaware of this.)