
After having built an app that was supposed to work offline and resync when the connection comes back, also using OT (a basic one), I have the feeling that the general problem is hard, but if you stick to solving your specific problem, things become much simpler. For example, OT operations can be either generic, such as "update a property of an object" (and then good luck solving conflicts), or very qualified with a business context, such as "transfer money from account a to b", in which case the server-side code you synchronize with will be much better able to resolve each issue case by case.
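To make the generic vs. qualified distinction concrete, here is a minimal sketch. Everything in it (the `SetBalance` and `Transfer` operations, the `apply_op` function, the account names) is made up for illustration and is not taken from any real sync library. Two offline devices each move 30 from a to b; replaying generic property writes loses one transfer, while replaying the qualified operation keeps both and lets the server enforce a rule.

```python
from dataclasses import dataclass

@dataclass
class SetBalance:
    """Generic operation: overwrite a property with a value computed offline."""
    account: str
    value: int

@dataclass
class Transfer:
    """Qualified operation: records the intent, not the resulting state."""
    source: str
    target: str
    amount: int

def apply_op(accounts, op):
    """Apply one queued operation on the server; return True if accepted."""
    if isinstance(op, SetBalance):
        # No business context: a stale value silently overwrites newer state.
        accounts[op.account] = op.value
        return True
    if isinstance(op, Transfer):
        # The server can decide case by case, e.g. reject an overdraft.
        if accounts[op.source] < op.amount:
            return False
        accounts[op.source] -= op.amount
        accounts[op.target] += op.amount
        return True
    raise TypeError(f"unknown operation: {op!r}")

# Both devices saw a = 100, b = 0 and each moved 30 from a to b while offline.
generic = {"a": 100, "b": 0}
for op in [SetBalance("a", 70), SetBalance("b", 30),
           SetBalance("a", 70), SetBalance("b", 30)]:
    apply_op(generic, op)

specific = {"a": 100, "b": 0}
for op in [Transfer("a", "b", 30), Transfer("a", "b", 30)]:
    apply_op(specific, op)

print(generic)   # {'a': 70, 'b': 30}: the second transfer is lost
print(specific)  # {'a': 40, 'b': 60}: both transfers are applied
```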



(Alexander from Realm here) Yes, where this approach fits is very dependent on the use case. In our experience the key is to have a data model that is flexible enough to express many different use cases. There might be cases where updating a single field is enough, or you might need it to work as a counter, or maybe your data will be better modeled by ordered insertions into a list.

Having the flexibility to use the exact combination of data types that fits your specific requirements and allows you to express your intent in a way that makes it clear how you want the conflicts resolved is essential to make it work.
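As a rough illustration of why the choice of data type matters (this is a hypothetical sketch, not Realm's API; `merge_register` and `merge_counter` are invented names), the same pair of offline edits gives different results when the field is treated as a single last-write-wins value versus a counter:

```python
def merge_register(a, b):
    """Last-write-wins field: each side is (timestamp, device_id, value)."""
    # Tuple comparison: the later timestamp wins, device_id breaks exact ties.
    return max(a, b)

def merge_counter(base, deltas):
    """Counter: concurrent increments are all kept and summed."""
    return base + sum(deltas)

# Two devices start from likes = 10 and each add one like while offline.
as_register = merge_register((1001, "phone", 11), (1002, "tablet", 11))
as_counter = merge_counter(10, [+1, +1])

print(as_register[2])  # 11: one of the two likes is lost
print(as_counter)      # 12: both intents survive
```

Neither result is wrong in itself; the point is that the type tells the merge step which outcome the developer intended.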

You can find a bit more info about our approach to conflict resolution here: https://realm.io/docs/realm-object-server/#conflict-resoluti...


That design doc seems to mix loose ideas with the current state of the system. For instance, are strings last-write-wins currently, or do they have character-level OT? Congrats on the release - I'm a huge fan of realtime, and I'd like nothing more than to build apps that depend on such a system - but seeing documentation with those kinds of holes gives a really bad first impression.


Sorry the docs are not clear. Right now you can only apply set operations on the entire string. Internally, in the database core, we support substring operations, but we haven't exposed them yet, mainly because we need to add the complement to listen for these changes. We have a working collaborative text editor demo/prototype, so we hope to have more to say soon!


Thanks for the link. When you mention "time" in the doc, what's the definition of "most recent"? Do you mean the last one received by the server? Or do you have any hint based on local state (either the local clock or a local logical clock)? That would make a huge difference if, let's say, I reconnect an old unsynchronized device with pending, outdated changes.


Time means logical clock. We incorporate a number of mechanisms to track causality, such that for each changeset we know exactly which other changesets it was based on. Timestamps are only used when two changesets that are causally unrelated conflict, in which case we use the timestamp to decide on the ordering.
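A minimal sketch of that rule, assuming vector clocks as the causality mechanism (an assumption for illustration only; the comment above does not say which mechanism Realm uses, and the `clock`, `timestamp` and `device` fields are invented): causal order decides first, and the wall-clock timestamp is consulted only for changesets that are causally unrelated.

```python
def happened_before(a, b):
    """True if vector clock a is causally before vector clock b."""
    keys = set(a) | set(b)
    return (all(a.get(k, 0) <= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) < b.get(k, 0) for k in keys))

def winner(x, y):
    """Pick the changeset whose write should take effect last."""
    if happened_before(x["clock"], y["clock"]):
        return y
    if happened_before(y["clock"], x["clock"]):
        return x
    # Causally unrelated: fall back to (timestamp, device) as a tiebreaker.
    return max(x, y, key=lambda c: (c["timestamp"], c["device"]))

# An old device with a wildly wrong wall clock, but causally earlier.
old = {"device": "phone", "clock": {"phone": 1}, "timestamp": 9999}
new = {"device": "tablet", "clock": {"phone": 1, "tablet": 1}, "timestamp": 5}

# Two changesets made independently of each other.
a = {"device": "phone", "clock": {"phone": 2}, "timestamp": 20}
b = {"device": "tablet", "clock": {"tablet": 1}, "timestamp": 10}

print(winner(old, new)["device"])  # tablet: causality overrides the timestamp
print(winner(a, b)["device"])      # phone: the timestamp breaks the tie
```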


How do you know whether timestamps are synchronized if the network is partitioned?


We don't, and we don't need to.

Timestamps are only used to merge conflicting but causally unrelated changes. In principle we could pick a random number instead of using the timestamp and we would still achieve convergence, but it just so happens that the current local time on the device is highly correlated with the user's experience of real time, so that if the user makes conflicting changes on two offline devices, those changes will still be properly ordered in the general case.
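The convergence claim can be checked with a toy example. In this sketch (hypothetical names, not Realm code) each write carries a random tag that is drawn once when the write is created and travels with it, standing in for the timestamp. Whatever order the writes arrive in, every replica ends up with the same value, because the merge only depends on a deterministic total order over the tags.

```python
import itertools
import random

def merge(current, incoming):
    """Keep whichever write carries the larger (tag, device) pair."""
    return max(current, incoming, key=lambda w: (w["tag"], w["device"]))

rng = random.Random(0)
writes = [{"device": d, "tag": rng.random(), "value": v}
          for d, v in [("phone", "red"), ("tablet", "green"), ("laptop", "blue")]]

# Replay the same three conflicting writes in every possible arrival order.
finals = set()
for order in itertools.permutations(writes):
    state = order[0]
    for w in order[1:]:
        state = merge(state, w)
    finals.add(state["value"])

print(len(finals))  # 1: all arrival orders converge on the same value
```

What the random tag gives up is the correlation with the user's sense of time, which is the reason given above for preferring the device clock.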


OK good luck with that. I guess you keep a graph of all the changes so that the accidental overwrites that will happen+ can be stepped back?

+ for example a surprising number of computers have their local time changed to avoid license terms of poorly enforced proprietary software.


You are absolutely right - the general problem is super hard. But even solving a specific use-case is hard for most developers who haven't done it before. It just takes a ton of time, and you often get some details wrong. And when you think you've nailed it, new requirements turn up :-) The result is that many apps don't work very well offline. That's why we have spent several man-years on making a general solution with sufficient flexibility to cover most use-cases. We want to enable all developers to add offline-first and real-time features without the hassle of reinventing the wheel every time.


Another reason that OT is hard is that it doesn't fall out of a generic proof; you have to prove that your particular instance of OT follows a mathematical equation. And there is no tooling to verify that (it can't be ensured by typical type checkers).

So people either rely on an OT implementation that is generic enough for their needs, or risk inconsistency or subtle bugs to chase after for years.

Other approaches such as CRDT or total order are not vulnerable to that problem, but they have other concerns (like read performance, garbage collection, or intent preservation).
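For readers wondering which equation is meant: the comment does not name it, but it is presumably the convergence property usually called TP1 in the OT literature. Writing T(a, b) for operation a transformed so that it can be applied after b, and reading a ∘ b as "apply a, then b", any two concurrent operations o_1 and o_2 must satisfy:

```latex
\[
  o_1 \circ T(o_2, o_1) \;\equiv\; o_2 \circ T(o_1, o_2)
\]
```

That is, both sites reach the same state regardless of which operation they applied first. This has to hold for every pair of operation types a system defines, which is why each new type adds to the proof burden.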


This is an interesting point. We at Realm have actually spent significant time thinking about it, and it is absolutely true that there is no easy way to prove the correctness of a system using OT.

But it helps to understand the fundamental constraints - for example, that operations must be commutative. In our case, we have had the luxury of designing our own database system, which means we could pick the semantics that we knew would lend themselves well to operational transformation. We have also made an effort to reuse semantics at multiple levels, limiting the number of OT instances that we had to convince ourselves would work.

Then of course, and for me at least, formal arguments are not enough, which is why we have spent a remarkable amount of resources on testing the system, including guided fuzz testing. :-)
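As a rough idea of what such testing can look like, here is a small sketch for a toy OT that only knows single-character inserts. It uses plain random inputs rather than guided fuzzing, and the `(position, char, site)` operation format and the tie-break on site id are assumptions made for this example, not Realm's design. The test applies two concurrent inserts in both orders and checks that the resulting text is identical.

```python
import random

def apply_insert(text, op):
    """Insert one character; op is (position, char, site_id)."""
    pos, ch, _site = op
    return text[:pos] + ch + text[pos:]

def transform(op, against):
    """Shift op so it can be applied after `against` has already been applied."""
    pos, ch, site = op
    a_pos, _a_ch, a_site = against
    # Inserts at the same position are ordered by site id so both sides agree.
    if a_pos < pos or (a_pos == pos and a_site < site):
        return (pos + 1, ch, site)
    return op

def fuzz(rounds=10_000, seed=0):
    """Check that both application orders converge for random concurrent inserts."""
    rng = random.Random(seed)
    for _ in range(rounds):
        text = "".join(rng.choice("abc") for _ in range(rng.randint(0, 6)))
        o1 = (rng.randint(0, len(text)), "X", 1)
        o2 = (rng.randint(0, len(text)), "Y", 2)
        left = apply_insert(apply_insert(text, o1), transform(o2, o1))
        right = apply_insert(apply_insert(text, o2), transform(o1, o2))
        assert left == right, (text, o1, o2)
    return rounds

fuzz()
```

Dropping the site-id tie-break from `transform` makes the assertion fail as soon as both inserts land on the same position, which is the kind of subtle bug this style of test is good at finding.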


Some advice: if you want people to trust and thus use this feature, you should give this kind of talk:

https://youtu.be/4fFDFbi3toc



