So you run naked while the master rebuilds? The only sensible solution is to run at least two slaves, IMHO. I haven't looked, but is that what you recommend?
Actually, that will be a recommended config for replica sets. Most of our slides already show 3 replicas per set.
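For concreteness, standing up a 3-member set from a driver looks roughly like this; a minimal sketch using pymongo, where the set name "rs0" and the db*.example.com hostnames are made up for illustration:

    # Sketch: initiate a 3-member replica set via pymongo.
    # "rs0" and the db*.example.com hostnames are placeholders.
    from pymongo import MongoClient

    # Connect directly to the node that will seed the set.
    client = MongoClient("db1.example.com", 27017, directConnection=True)

    config = {
        "_id": "rs0",  # must match each mongod's --replSet name
        "members": [
            {"_id": 0, "host": "db1.example.com:27017"},
            {"_id": 1, "host": "db2.example.com:27017"},
            {"_id": 2, "host": "db3.example.com:27017"},
        ],
    }

    # replSetInitiate is the admin command behind the shell's rs.initiate().
    client.admin.command("replSetInitiate", config)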
Also, most people have backups, so it's not really "running naked". See http://www.mongodb.org/display/DOCS/Backups for a few ways to back up MongoDB. With LVM/EBS/ZFS or any other snapshot-capable filesystem, backups can be done almost instantly. With EBS you can even spin up an insta-slave from the snapshot.
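To make "almost instantly" concrete: the usual trick is to flush and lock mongod around the snapshot so the files on disk are consistent. A rough sketch with pymongo; the snapshot step itself depends on your filesystem, so it's stubbed out here as a placeholder:

    # Sketch: quiesce mongod around a filesystem snapshot (LVM/EBS/ZFS).
    from pymongo import MongoClient

    def take_snapshot():
        # Placeholder: run lvcreate --snapshot, zfs snapshot,
        # or an EBS snapshot API call here.
        pass

    client = MongoClient("localhost", 27017)

    # Flush pending writes to disk and block new writes.
    client.admin.command("fsync", lock=True)
    try:
        take_snapshot()
    finally:
        # Always release the write lock, even if the snapshot fails.
        client.admin.command("fsyncUnlock")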
The suggested default step 1 for MongoDB is to acquire 3 servers?
I mean, no other database suggests such a huge default configuration. Even knowing that their datacenter can get hit by lightning and all, a lot of large production sites don't even run with this kind of redundancy.
This seems like a pretty taxing workaround for not keeping an append-only transaction log.
No, it is a suggested configuration, not the suggested one. And recommending three nodes is not that uncommon for distributed systems, because you can't have a useful quorum with only two nodes: a majority of two is both of them, so losing either node costs you the quorum. Some users are more concerned with handling 10s or 100s of servers than with single-server durability.
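The arithmetic behind that is simple enough to spell out; a tiny sketch of the standard majority rule:

    # Majority-quorum arithmetic for n voting members.
    def quorum(n: int) -> int:
        """Smallest majority of n voters."""
        return n // 2 + 1

    for n in (1, 2, 3, 4, 5):
        print(f"{n} nodes: quorum={quorum(n)}, failures tolerated={n - quorum(n)}")

    # 2 nodes: quorum=2, failures tolerated=0  (lose either node, lose the quorum)
    # 3 nodes: quorum=2, failures tolerated=1  (smallest size that survives a failure)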
That said, for most users two servers are fine. Other users don't need any replicas at all, since they do a nightly dump from a stored source into MongoDB, or they just take regular backups. There are many ways to achieve system-wide durability, and not all of them require the database itself to be durable.
"the suggested default step 1 for MongoDB is to acquire 3 servers?"
At my previous job, we suffered a catastrophic RAID failure. The controller died while trying to repair a degraded array. At that point, we valued the data enough that we weren't going to screw around with replacing the controller and restarting the recovery. The whole RAID array went out to DriveSavers, at a cost of several thousand dollars. Now, DriveSavers are really awesome folks, but my goal is to never do business with them again. :-)
After this incident, our policy was simple: If our data mattered, it had to be replicated.
A 3-server configuration is, admittedly, pretty high end: even if one node is down, you still have redundancy, and you can keep the remaining two nodes live while rebuilding the failed one.
I know lots of people who care _deeply_ about data integrity, but who keep their databases on a single server. I find this a bit mystifying: Even the most expensive hardware can die in ugly and unrecoverable ways. And RAID arrays are some of the biggest culprits: Their striping formats are usually undocumented and proprietary.