There are other reasons to choose NoSQL. For example, when Craigslist was using MySQL and they had to change their schema, it took MONTHS to propagate the change across all their slaves.
You can also have a mixed strategy of using both an RDBMS and NoSQL, getting consistency where you need it while staying flexible in the face of architecture changes.
Lastly -- have you looked at total overall cost? Setting up a large cluster with MySQL has a large operational cost, and it may not be partition tolerant: if the wrong servers go down, the failure can cascade to your whole data store.
In a NoSQL system, "doing data migrations up front" often just means you've changed the fields around in your document types and called it done. Now 95% of your documents are wrong, missing those changes. Maybe you have linkages between different types of document (akin to a foreign key) -- now a lot of those might be pointing to nothing.
In this model, it's the application's job to anticipate and work around these inconsistencies. Assertions that check for data integrity, if you have them, have to be modified to work around this. For a lot of the web applications we talk about these days, who cares - it's people's lists of friends and TODO notes. It's simple data and some dangling records aren't going to hurt anyone.
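To make that concrete, here's a minimal sketch (plain Python dicts standing in for a document store) of the kind of defensive reads an application ends up doing when old documents were never migrated; the field names and fallbacks are made up for illustration:

```python
def friend_names(user_doc, users_by_id):
    """Return display names for a user's friends, tolerating stale documents."""
    names = []
    # Old documents may predate the 'friend_ids' field entirely.
    for friend_id in user_doc.get("friend_ids", []):
        friend = users_by_id.get(friend_id)
        if friend is None:
            # Dangling reference: the "foreign key" points at nothing.
            continue
        # Newer documents use 'display_name'; older ones only had 'name'.
        names.append(friend.get("display_name") or friend.get("name", "unknown"))
    return names
```

Every read path accumulates checks like these, which is exactly the "work around these inconsistencies" cost.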
In the SQL world, we instead write migration scripts that migrate the structure and data all at once. This is a little more work up front, but as long as you stick to a mostly normalized form and use good tools (which we now have) this is not a big deal ("MONTHS" to migrate across slaves sounds like some very wrong decisions were made earlier on). The application can move straight to supporting the new structure and entirely forget that the old one existed. In this world, we can also have really complex relationships between fields, like when we're storing accounting or medical data linked to temporal records of changes. The application can consume this structure without worrying about consistency.
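For the sake of illustration, here's a rough sketch of what such a migration script might look like in Python with psycopg2 -- the table, columns, and connection string are invented, and this is just one way to do the structure-plus-data change in a single step:

```python
import psycopg2

# Hypothetical migration: add a 'display_name' column, backfill it from
# existing data, then enforce NOT NULL once every row has a value.
MIGRATION = """
ALTER TABLE users ADD COLUMN display_name text;
UPDATE users SET display_name = first_name || ' ' || last_name;
ALTER TABLE users ALTER COLUMN display_name SET NOT NULL;
"""

def run_migration(dsn="dbname=app"):
    conn = psycopg2.connect(dsn)
    try:
        with conn:                      # one transaction: all or nothing
            with conn.cursor() as cur:
                cur.execute(MIGRATION)
    finally:
        conn.close()

if __name__ == "__main__":
    run_migration()
```

Because the whole thing runs in one transaction (PostgreSQL DDL is transactional), readers see either the old schema or the fully populated new column, never a half-migrated state.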
I think the problem is that when you change a table structure in MySQL, it takes quite a bit of time to do the actual writes. Adding a column tends to force a rewrite of the whole table. Braindead, I know...
I think the problem is that MySQL isn't really a standard SQL-world db. It has some of the advantages of one, but not all of them and some very annoying gotchas.
> The application can move straight to supporting the new structure and entirely forget that the old one existed.
Not always. Changing the schema can break an application, in particular when the database supports multiple applications.
> It's simple data and some dangling records aren't going to hurt anyone.
It depends on the application. In healthcare, people literally die because of dangling, inconsistent records.
The problem with anticipating and working around these inconsistencies is that the workaround is added ad hoc in code, rather than through a model that defines the data change. You need a model: http://chronicdb.com/preparing_schema_changes
PostgreSQL can do the majority of migrations you would need to make instantaneously (adding nullable columns, adding/dropping indexes, etc.). MySQL suffers roughly O(rows) time when changing the schema, but with PostgreSQL now at feature parity on almost everything else, there's little reason to go NoSQL over something like Postgres if you are worried about schema changes. With respect to fault tolerance, this is only a problem if your dataset outgrows a single master (pretty rare) and you haven't figured out a proper way to shard (most places shard on users; it's pretty hard to believe anything is un-shardable).
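For anyone keeping score, a hedged illustration of which changes are cheap versus expensive on the PostgreSQL versions of this era (table and column names invented):

```python
# Cheap or non-blocking changes: catalog-only edits, or index builds that
# avoid holding a long write lock.
CHEAP_OR_NON_BLOCKING = [
    # Nullable column, no default: a catalog-only change, effectively instant.
    "ALTER TABLE users ADD COLUMN nickname text",
    # Dropping an index is quick; building one concurrently still scans the
    # table but does not block writes for the duration.
    "DROP INDEX users_email_idx",
    "CREATE INDEX CONCURRENTLY users_email_idx ON users (email)",
]

# Changes that touch every row and therefore take O(rows) time.
FULL_TABLE_REWRITE = [
    # Adding a column with a NOT NULL default rewrote every row on the
    # PostgreSQL versions of this era.
    "ALTER TABLE users ADD COLUMN plan text NOT NULL DEFAULT 'free'",
    # Changing a column's type generally rewrites the table as well.
    "ALTER TABLE users ALTER COLUMN id TYPE bigint",
]
```

The first group touches only the system catalogs; the second touches every row, which is where the O(rows) cost comes from.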
I don't want to presume that I'm smarter than Craigslist's engineers, but why is this so hard? Schema changes should be captured in migrations, which are programs, and could be coordinated to run in parallel on thousands of machines in a moment with a shell script over SSH.
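In that spirit, a back-of-the-envelope sketch of the "coordinate it over SSH" idea in Python -- the hostnames, database name, and migration path are all placeholders, and a real replication setup adds wrinkles this ignores (replication lag, taking slaves out of rotation, and so on):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Placeholder fleet and migration command.
HOSTS = ["db-slave-%03d" % i for i in range(1, 201)]
REMOTE_CMD = "mysql app_db < /opt/migrations/2011_add_display_name.sql"

def migrate(host):
    # ssh propagates the remote command's exit status; record it per host.
    result = subprocess.run(["ssh", host, REMOTE_CMD],
                            capture_output=True, text=True)
    return host, result.returncode

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=32) as pool:
        for host, status in pool.map(migrate, HOSTS):
            print(f"{host}: {'ok' if status == 0 else 'FAILED'}")
```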
The fact that ALTER TABLE locks on most databases is not the primary reason changing a schema is hard. The primary reason is that a schema change breaks the application [1].
Migrations are programs, but your existing live application is also a program. Changing the schema breaks your live application to prepare for the new version of the application.
This is true for NoSQL too. If you change the format of your documents, your applications might need to be modified to support both the old and the new format at the same time.
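A small sketch of what that dual-format support tends to look like: readers normalize both shapes, writers always emit the new one, so documents get upgraded lazily as they are touched. The nested 'address' structure here is a hypothetical example, not anything from the article:

```python
def normalize(doc):
    """Return a document in the new shape, upgrading old ones on the fly."""
    if "address" in doc:                      # already the new format
        return doc
    # The old format stored the address pieces at the top level.
    upgraded = {k: v for k, v in doc.items()
                if k not in ("street", "city", "zip")}
    upgraded["address"] = {
        "street": doc.get("street"),
        "city": doc.get("city"),
        "zip": doc.get("zip"),
    }
    return upgraded

def save(collection, doc):
    """Persist a document (dict standing in for the store), always in the new format."""
    collection[doc["_id"]] = normalize(doc)
```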
MySQL had some historical issues with schema changes -- adding a column to a table with a few million records could take hours. While there is a lot of work to do -- every page of the table gets restructured -- in competing products such a change takes a tiny fraction of the time (limited only by I/O performance).
Most changes would still block, yes, but at least in PostgreSQL many changes will only block for a short time (since they do not require a full table rewrite, only an update to the table description), and in my experience rewriting a table is generally quick for a decent-sized database (tables with a handful of millions of rows).
Flexibility of schema definition and flexibility of schema change are two different things. Defining schemas only involves data. But changing schemas involves not just data, but code too.
> There are other reasons to choose NoSQL. For example, when Craigslist was using MySQL and they had to change their schema, it took MONTHS to propagate the change across all their slaves.
That isn't a reason to choose NoSQL. That's a reason to:
- doubt MySQL -- many of the purported downsides of SQL solutions (for instance the ridiculous "avoid joins" meme, which has zero bearing on any good database product) are actually MySQL problems. Or rather, they were -- the product has made some pretty incredible strides.
- understand and embrace normalization. Most of the "we keep trying to change our schema and SQL is just so restrictive" stories could often be described as "the problems with denormalization" (a rough sketch of the pattern follows below).
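To illustrate what that tends to mean in practice, a toy example (table and column names invented): the denormalized shape bakes a fixed number of phone fields into the row, so "we need a fourth phone number" becomes a schema change, while the normalized shape absorbs it as plain data.

```python
# Denormalized: the repeating group lives in the row, so growing it means DDL.
DENORMALIZED = """
CREATE TABLE customers (
    id      serial PRIMARY KEY,
    name    text NOT NULL,
    phone1  text,
    phone2  text,
    phone3  text      -- adding phone4 later means ALTER TABLE everywhere
);
"""

# Normalized: the repeating group is its own table, so no schema change needed.
NORMALIZED = """
CREATE TABLE customers (
    id    serial PRIMARY KEY,
    name  text NOT NULL
);

CREATE TABLE customer_phones (
    customer_id  integer NOT NULL REFERENCES customers (id),
    phone        text NOT NULL    -- any number of phones, no schema change
);
"""
```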