> if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
This assumes that the set of endpoints (really, nodes) is both known to every other node in the network and stable over time (meaning new nodes are never added).
Even if this assumption can be made safely (which is not a given), the GC process described here is still only an optimization, and one that is defeated as soon as a single node in the network becomes slow or broken.
It's also basically orthogonal to the concept of "tombstones", which are still required if you want to delete anything from the data structure.
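
To make the stability point concrete, here's a rough sketch of the condition the quoted GC scheme relies on. It's a simplification, not how any particular CRDT library does it: I'm assuming a single global sequence number instead of per-node version vectors, and the `History` class and its method names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class History:
    # Assumption: membership is fixed and known up front.
    nodes: frozenset[str]
    # acked[node] = highest op sequence number that node has confirmed receiving.
    acked: dict[str, int] = field(default_factory=dict)

    def ack(self, node: str, seq: int) -> None:
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.acked[node] = max(self.acked.get(node, 0), seq)

    def stable_upto(self) -> int:
        """Highest seq acknowledged by every node; only ops at or below it are prunable."""
        return min(self.acked.get(n, 0) for n in self.nodes)


h = History(frozenset({"a", "b", "c"}))
h.ack("a", 10)
h.ack("b", 9)
print(h.stable_upto())  # 0: "c" has acknowledged nothing, so nothing can be pruned

h.ack("c", 4)
print(h.stable_upto())  # 4: the slowest node sets the limit
```

The `min` over all nodes is the whole problem: one node that stops acknowledging pins the prunable prefix in place, and a node that joins later isn't in `nodes` at all.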
> Similar to OT, in certain scenarios, it's sufficient to ensure that only a subset of peers have the complete data, while others don't need the full history. For instance, in real-time collaboration scenarios with a central server, we can, just like OT, allow clients to hold only a shallow clone instead of the complete history. This approach results in minimal overhead for the clients.
I guess it all depends on how you define "client" and "shallow history", and the guarantees you provide around propagation of that history.
But if you have a central server that is considered to be the authoritative source of state, and assuming clients interact with that central server directly, then I'm not sure what is accomplished by modeling your data with CRDTs in the first place?