I've decided I want to use Riak as well. Does anyone have examples of how they used it with their data model?
For example, this article says "With appropriate logic (set unions, timestamps, etc) it is easy to resolve these conflicts". However, timestamps are not an adequate way to do this, because events in a distributed system are only partially ordered and clocks on different nodes can't be relied on to agree. magicd may be serialising all requests to Riak to mitigate this (essentially using magicd's own clock as the time reference), in which case they lose the distributed nature of Riak (magicd becomes a single point of failure and a bottleneck).
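To make the partial-ordering point concrete, here is a minimal sketch of comparing two vector clocks, the general mechanism Riak relies on to detect conflicting writes. The dict layout and the `compare` function are my own illustration, not Riak's API.

```python
def compare(a, b):
    """Compare two vector clocks (dicts mapping node name -> counter).

    Returns "before", "after", "equal", or "concurrent".
    Hypothetical helper for illustration only.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither clock dominates the other: the writes are concurrent,
    # and no wall-clock timestamp can reliably say which came "first".
    return "concurrent"


# Two writes that each saw a different node's update.
print(compare({"n1": 2, "n2": 0}, {"n1": 1, "n2": 1}))
```

When the result is "concurrent", the application has to decide how to merge the two values, which is the case the question is about.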
Insight into how others have approached this would be awesome.
There are several ways to approach this. The simplest is to take last-write-wins, which is the only option some distributed databases give you. For cases where this isn't ideal, you can resolve write conflicts in a couple of ways.
One way is to write domain-specific logic that knows how to resolve your values. For example, your models might have some state that can only happen after another state, so conflicts of this nature resolve to the 'later' state.
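As a rough sketch of that kind of domain-specific resolution, assume a made-up order model whose state only ever moves forward. The state names and `resolve_state` are hypothetical, and the conflicting values (siblings) are assumed to have already been fetched and decoded into dicts.

```python
# Hypothetical lifecycle: each state can only happen after the one before it.
STATE_ORDER = ["created", "paid", "shipped", "delivered"]
RANK = {state: i for i, state in enumerate(STATE_ORDER)}


def resolve_state(siblings):
    """Pick the sibling whose state is furthest along the lifecycle."""
    return max(siblings, key=lambda value: RANK[value["state"]])


conflict = [
    {"order_id": 7, "state": "paid"},
    {"order_id": 7, "state": "shipped"},
]
print(resolve_state(conflict))
```

Note this needs no timestamps at all: the ordering comes from the domain, so it doesn't matter which node's clock was ahead.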
Another approach is to use data structures or a library designed for this, like CRDTs. Some resources below:
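For a feel of what a CRDT looks like, here is a minimal sketch of the simplest one, a grow-only set, where merging is just set union. The `GSet` class is written from scratch for illustration and is not taken from any particular library.

```python
class GSet:
    """Grow-only set: elements can be added but never removed."""

    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, item):
        # Returns a new GSet so replicas never mutate shared state.
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so replicas
        # converge regardless of the order in which merges happen.
        return GSet(self.items | other.items)


# Two replicas accept different writes, then reconcile.
replica_a = GSet().add("x")
replica_b = GSet().add("y")
merged = replica_a.merge(replica_b)
print(sorted(merged.items))
```

The trade-off is that a plain grow-only set can't express removals; CRDT variants that support removal carry extra bookkeeping.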
Unless I'm missing something, I would assume they run magicd on every server that runs the application. Riak's degree of redundancy is then independent of magicd's, since each instance of magicd can communicate with the entire Riak pool.
Yep! This is exactly how it works. Each app node runs a magicd which connects to an haproxy instance on localhost (connected to every machine in the database cluster), so when a Riak node goes down we don't miss a beat.