One box is the master; the others are slaves. And yes, the easy way to do this is to have the system take a single write lock.
That seems ridiculous if you are thinking like databases do, in terms of taking away the pain of all those disk seeks needed to write something. But if everything is hot in RAM, executing a command is extremely fast. Much faster than a database.
If that still isn't fast enough, you can split your data graph into chunks that don't require simultaneous locking and have one lock per chunk. For example, if you are making a stock exchange like the LMAX people were, you can have one set of data (and one lock) per stock.
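A minimal sketch of that partitioning, assuming a hypothetical per-symbol order book: each stock gets its own lock, so writes to different stocks never contend with each other.

```python
import threading

# Hypothetical per-stock partitioning: one independent lock (and one set
# of data) per symbol, so concurrent writes to different stocks don't block
# each other. Only writes to the *same* stock are serialized.
books = {sym: {"lock": threading.Lock(), "orders": []} for sym in ("AAPL", "GOOG")}

def place_order(symbol, order):
    book = books[symbol]
    with book["lock"]:               # serializes writes for this stock only
        book["orders"].append(order)

place_order("AAPL", ("buy", 100))
place_order("GOOG", ("sell", 50))
```

The names (`books`, `place_order`) are made up for illustration; the point is only that the lock's scope shrinks from the whole graph to one chunk of it.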
The lock doesn't need to just cover one write. It needs to cover the whole transaction. The canonical example is that of incrementing a counter. Replica synchronization aside, without a transaction lock (or some other guarantee of transaction consistency), at some point you will read a counter value which another client is in the middle of updating.
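A sketch of the counter example, assuming plain Python threads: the lock is held across the whole read-modify-write, which is what makes the increment a transaction rather than two independent operations.

```python
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    # The lock covers the entire read-modify-write transaction. Locking the
    # read and the write separately would still allow lost updates: two
    # clients could both read the same value and both write value + 1.
    with lock:
        value = counter       # read
        counter = value + 1   # write

threads = [threading.Thread(target=increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 100 with the transaction lock; without it, possibly fewer
```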
The first "fix" to this that comes to mind is timestamping data and rolling back any transaction that tries to write after having read outdated data. Do extant NoDB systems do this?
Yes, these systems provide proper transactions, but the approach is much simpler than you imagine.
Imagine you have a graph of objects in RAM. Say, a Set of Counters. Imagine you want to change the graph by incrementing a counter. To do this, you create a command object, an UpdateCounterCommand, with one method: execute. You hand the command to the system, and it puts it in a queue. When the command reaches the front of the queue, the system serializes it to disk and then executes it. Exactly one write command runs at a time.
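The scheme above can be sketched as follows. This is a toy model, not Prevayler's actual API: the journal is an in-memory list standing in for the append-only log on disk, and the class and variable names are made up for illustration.

```python
import json
import queue
import threading

class UpdateCounterCommand:
    """Command object: the unit that is journaled and then executed."""
    def execute(self, system):
        system["counter"] += 1
    def serialize(self):
        return json.dumps({"cmd": "UpdateCounterCommand"})

system = {"counter": 0}   # the object graph held in RAM
journal = []              # stands in for the append-only log on disk
commands = queue.Queue()

def writer_loop():
    # Exactly one write command runs at a time: take the next command off
    # the queue, journal it (write-ahead), then apply it to the graph.
    while True:
        cmd = commands.get()
        if cmd is None:          # sentinel: shut down the writer
            break
        journal.append(cmd.serialize())  # persist before applying
        cmd.execute(system)

writer = threading.Thread(target=writer_loop)
writer.start()
for _ in range(3):
    commands.put(UpdateCounterCommand())
commands.put(None)
writer.join()
print(system["counter"], len(journal))  # 3 3
```

Because the journal records every command before it runs, the graph can be rebuilt after a crash by replaying the log from the last snapshot; that replay is what gives the durability half of the ACID guarantees.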
For a real-world example, check out Prevayler. It provides all the ACID guarantees that a database does, but in a very small amount of code.