> I really don't understand how anything of what you wrote follows from the fact...

Tainnor · on Aug 3, 2022

> Sure, if you can represent what the user wants to do as a "command" like that, that doesn't rely on a particular state of the world, then you're fine. Note that this is also exactly the case that an eventually consistent event-sourcing style system will handle fine.

Yes, but the event-sourcing system (or similar variants, such as CRDTs) is much more complex. It's true that it buys you some things (like the ability to roll back to specific versions), but you have to ask yourself whether you really need that for a specific piece of data.

(And even if you use event sourcing, if you have many events, you probably won't want to replay all of them, so you'll maybe want to store the result in a database, in which case you can choose a relational one.)

> If two people try to edit the same wiki page at the same time, either one of them loses their data, or you implement some kind of "userspace" reconciliation logic - but database transactions can't help you with that.

Yes, but

a) that's simply not a problem in all situations. People will generally not update their user profile concurrently with other users, for example. So it only applies to situations where data is truly shared across multiple users, and it doesn't make sense to build a complex system only for these use cases,

b) the problem of users overwriting other users' data is inherent to the problem domain; you will, in the end, have to decide which version is the most recent regardless of which technology you use. The one thing that events etc. buy you is a version history (which btw can also be implemented with an RDBMS), but if you want to expose that in the UI so the user can go back, you have to do additional work anyway - it doesn't come for free.

c) Meanwhile, the RDBMS will at least guarantee that the data is always in a consistent state. Users overwriting other users' data is unfortunate, but corrupted data is worse.

d) You can solve the "concurrent modification" issue in a variety of ways, depending on the frequency of the problem, without having to implement a complex event-sourced system. For example, a lock mechanism is fairly easy to implement and useful in many cases. You could also, for example, hash the contents of what the user is seeing and reject the change if there is a mismatch with the current state (I've never tried it, but it should work in theory).
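A minimal sketch of the hash check from d), assuming a hypothetical `pages` table that stores a `body_hash` next to the body. The comparison lives in the `WHERE` clause of the `UPDATE`, so check and write happen in one statement; the table, column and function names are made up for illustration.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    """Hash of the page body as the user saw it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def save_page(conn: sqlite3.Connection, page_id: int, seen_hash: str, new_body: str) -> bool:
    """Write new_body only if the stored hash still equals seen_hash."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE pages SET body = ?, body_hash = ? WHERE id = ? AND body_hash = ?",
            (new_body, content_hash(new_body), page_id, seen_hash),
        )
    # rowcount is 0 when somebody else changed the page in the meantime
    return cur.rowcount == 1


# Hypothetical schema, in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT NOT NULL, body_hash TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO pages (id, body, body_hash) VALUES (1, ?, ?)",
    ("original text", content_hash("original text")),
)
conn.commit()

# Alice and Bob both loaded the same version of the page.
seen = content_hash("original text")
alice_saved = save_page(conn, 1, seen, "edit by Alice")
bob_saved = save_page(conn, 1, seen, "edit by Bob")  # stale hash, rejected
current_body = conn.execute("SELECT body FROM pages WHERE id = 1").fetchone()[0]
```

The rejected writer still has their text in hand, so the application can show a conflict message instead of silently overwriting.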

I don't wish to claim that a relational database solves all transactionality (and consistency) problems, but they certainly solve some of them - so throwing them out because of that is a bit like "tests don't find all bugs, so we don't write them anymore".

lmm · on Aug 3, 2022

> Yes, but the event-sourcing system (or similar variants, such as CRDTs) is much more complex.

It's really not. An RDBMS usually contains all of the same stuff underneath the hood (MVCC etc.), it just tries to paper over it and present the illusion of a single consistent state of the world, and unfortunately that ends up being leaky.

> a) that's simply not a problem in all situations. People will generally not update their user profile concurrently with other users, for example. So it only applies to situations where data is truly shared across multiple users,

Sure - but those situations are ipso facto situations where you have no need for transactions.

> b) the problem of users overwriting other users' data is inherent to the problem domain; you will, in the end, have to decide which version is the most recent regardless of which technology you use. The one thing that events etc. buy you is a version history (which btw can also be implemented with an RDBMS), but if you want to expose that in the UI so the user can go back, you have to do additional work anyway - it doesn't come for free.

True, but what does come for free is thinking about it when you're designing your dataflow. Using an event sourcing style forces you to confront the idea that you're going to have concurrent updates going on, early enough in the process that you naturally design your data model to handle it, rather than imagining that you can always see "the" current state of the world.

> c) Meanwhile, the RDBMS will at least guarantee that the data is always in a consistent state. Users overwriting other users' data is unfortunate, but corrupted data is worse.

I'm not convinced, because the way it accomplishes that is by dropping "corrupt" data on the floor. If user A tries to save new post B in thread C, but at the same time user D has deleted that thread, then in an RDBMS where you're using a foreign key the only thing you can do is error and never save the content of post B. In an event sourcing system you still have to deal with the fact that the post belongs in a nonexistent thread eventually, but you don't start by losing the user's data, and it's very natural to do something like mark it as an orphaned post that the user can still see in their own post history, which is probably what you want. (Of course you can achieve that in the RDBMS approach, but it tends to involve more complex logic, giving up on foreign keys and accepting that you have to solve the same data integrity problems as a non-ACID system, or both.)
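To make the orphaned-post idea concrete, here is a rough sketch, assuming a toy in-memory event log. The event types and the `posts_for_author` projection are invented for illustration and not taken from any particular event-sourcing library.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class ThreadDeleted:
    thread_id: str


@dataclass(frozen=True)
class PostSaved:
    post_id: str
    thread_id: str
    author: str
    body: str


Event = Union[ThreadDeleted, PostSaved]


def posts_for_author(log: List[Event], author: str) -> List[Dict[str, object]]:
    """Project the log into one user's post history, flagging orphaned posts."""
    deleted = {e.thread_id for e in log if isinstance(e, ThreadDeleted)}
    return [
        {"post_id": e.post_id, "body": e.body, "orphaned": e.thread_id in deleted}
        for e in log
        if isinstance(e, PostSaved) and e.author == author
    ]


# User D's delete reaches the log before user A's post; both are kept.
log: List[Event] = [
    ThreadDeleted(thread_id="C"),
    PostSaved(post_id="B", thread_id="C", author="A", body="my reply"),
]
history = posts_for_author(log, "A")
```

The post content survives in the log, and the decision about how to present it is deferred to the projection.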

> d) You can solve the "concurrent modification" issue in a variety of ways, depending on the frequency of the problem, without having to implement a complex event-sourced system. For example, a lock mechanism is fairly easy to implement and useful in many cases. You could also, for example, hash the contents of what the user is seeing and reject the change if there is a mismatch with the current state (I've never tried it, but it should work in theory).

That sounds a whole lot more complex than just sticking it in an event sourcing system. Especially when the problem is rare, it's much better to find a solution where the correct behaviour naturally arises in that case, than implement some kind of ad-hoc special case workaround that will never be tested as rigorously as your "happy path" case.

Tainnor · on Aug 4, 2022

> It's really not. An RDBMS usually contains all of the same stuff underneath the hood (MVCC etc.), it just tries to paper over it and present the illusion of a single consistent state of the world, and unfortunately that ends up being leaky.

There's nothing leaky about it. Relational algebra is a well-understood mathematical abstraction. Meanwhile, I can just set up postgres and an ORM (or something more lightweight, if I prefer) and I'm good to go - there's thousands of examples of how to do that. Event-sourced architectures have decidedly more pitfalls. If my event handling isn't commutative, associative and idempotent I'm either losing out on concurrency benefits (because I'm asking my queue to synchronise messages) or I'll get undefined behaviour.
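A small illustration of why those properties matter, assuming a toy setup where events may arrive in any order and one of them is redelivered. Both handlers are hypothetical: the set-union one is order-independent and idempotent, while the increment one is order-independent but not idempotent.

```python
from itertools import permutations


def apply_tag_added(state: frozenset, tag: str) -> frozenset:
    """Set union: order-independent, and applying the same event twice changes nothing."""
    return state | {tag}


def apply_increment(state: int, amount: int) -> int:
    """Addition: order-independent, but a redelivered event is counted twice."""
    return state + amount


tag_events = ["red", "green", "blue"]
tag_results = set()
for order in permutations(tag_events):
    delivered = list(order) + [order[0]]  # simulate one duplicate delivery
    state = frozenset()
    for event in delivered:
        state = apply_tag_added(state, event)
    tag_results.add(state)
# Every ordering, duplicates included, converges to the same state.

increments = [1, 2, 3]
counter_exact = 0
for amount in increments:
    counter_exact = apply_increment(counter_exact, amount)

counter_with_duplicate = 0
for amount in increments + [increments[0]]:  # the first event is redelivered
    counter_with_duplicate = apply_increment(counter_with_duplicate, amount)
# The two counters now disagree, so this handler needs deduplication upstream.
```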

There's really probably no scenario in which implementing a CRUD app with a relational database isn't going to take significantly less time than some event sourced architecture.

> Sure - but those situations are ipso facto situations where you have no need for transactions.

> Using an event sourcing style forces you to confront the idea that you're going to have concurrent updates going on

There are tons of examples like backoffice tools (where people might work in shifts or on different data sets), delivery services, language learning apps, flashcard apps, government forms, todo list and note taking apps, price comparison services, fitness trackers, banking apps, and so on, where some or even most of the data is not usually concurrently edited by multiple users, but where you will still probably want consistency guarantees across multiple tables.

Yes, if you're building Twitter, by all means use event sourcing or CRDTs or something. But we're not all building Twitter.

> If user A tries to save new post B in thread C, but at the same time user D has deleted that thread, then in an RDBMS where you're using a foreign key the only thing you can do is error and never save the content of post B.

I don't think I've ever seen a forum app that doesn't just "throw away" the user comment in such a case, in the sense that it will not be stored in the database. Sure, you might have some event somewhere, but how is that going to help the user? Should they write a nice email and hope that some engineer with too much time is going to find that event somewhere buried deep in the production infrastructure and then ... do what exactly with it?

This is a solution in search of a problem. Instead, you should design your UI such that the comment field is not cleared upon a failed submission, like any reasonable forum software. Then the user who really wants to save their ramblings can still do so, without the need of any complicated event-sourcing mechanism. And in most forums, threads are rarely deleted, only locked (unless it's outright spam/illegal content/etc.)

(Also, there are a lot of different ways things can be designed when you're using an RDBMS. You can also implement soft deletes (which many applications do) and then you won't get any foreign key errors. In that way, you can still display "orphaned" comments that belong to deleted threads, if you so wish (have never seen a forum do that, though). Recovering a soft deleted thread is probably also an order of magnitude easier than trying to replay it from some events. Yes, soft deletes involve other tradeoffs - but so does every architecture choice.)
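A minimal sketch of the soft-delete variant, assuming a hypothetical `threads`/`posts` schema with a nullable `deleted_at` column. SQLite is used here only so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(
    """
    CREATE TABLE threads (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        deleted_at TEXT              -- NULL means the thread is live
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL REFERENCES threads(id),
        author    TEXT NOT NULL,
        body      TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO threads (id, title) VALUES (1, 'Thread C')")

# User D "deletes" the thread: the row stays, so the foreign key still resolves.
conn.execute("UPDATE threads SET deleted_at = datetime('now') WHERE id = 1")

# User A saves post B afterwards; the insert succeeds instead of erroring.
conn.execute("INSERT INTO posts (thread_id, author, body) VALUES (1, 'A', 'post B')")
conn.commit()

# Normal listings hide soft-deleted threads.
visible_threads = conn.execute(
    "SELECT id FROM threads WHERE deleted_at IS NULL"
).fetchall()

# A user's own history can still surface posts from deleted threads.
orphaned_posts = conn.execute(
    """
    SELECT p.body
    FROM posts p
    JOIN threads t ON t.id = p.thread_id
    WHERE p.author = ? AND t.deleted_at IS NOT NULL
    """,
    ("A",),
).fetchall()

# Recovering the thread is a single update.
conn.execute("UPDATE threads SET deleted_at = NULL WHERE id = 1")
conn.commit()
restored_threads = conn.execute(
    "SELECT id FROM threads WHERE deleted_at IS NULL"
).fetchall()
```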

> That sounds a whole lot more complex than just sticking it in an event sourcing system. Especially when the problem is rare, it's much better to find a solution where the correct behaviour naturally arises in that case.

I really disagree that a locking mechanism is more difficult than an event sourced system. The mechanism doesn't have to be perfect. If a user loses the lock because they haven't done anything in half an hour, then in many cases that's completely acceptable. Such a system is not hard to implement (I could just use a redis store with expiring entries) and it will also be much easier to understand, since you now don't have to track the flow of your business logic across multiple services.
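Roughly what I mean, as a single-process sketch with an in-memory dict standing in for the redis store. The `ExpiringLocks` class and the resource key are made up for illustration; with Redis itself the acquire step would typically map to `SET` with the `NX` and `EX` options. The clock is injectable so that expiry can be tested without sleeping.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringLocks:
    """Advisory edit locks that lapse after ttl_seconds (not thread-safe)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._held: Dict[str, Tuple[str, float]] = {}  # resource -> (user, expires_at)

    def acquire(self, resource: str, user: str) -> bool:
        """Take or refresh the lock unless another user holds an unexpired one."""
        now = self._clock()
        holder = self._held.get(resource)
        if holder is not None and holder[0] != user and holder[1] > now:
            return False
        self._held[resource] = (user, now + self._ttl)
        return True

    def release(self, resource: str, user: str) -> None:
        """Drop the lock, but only if this user is the current holder."""
        holder = self._held.get(resource)
        if holder is not None and holder[0] == user:
            del self._held[resource]


# Fake clock so the half-hour timeout can be simulated instantly.
now = [0.0]
locks = ExpiringLocks(ttl_seconds=1800, clock=lambda: now[0])

alice_got_lock = locks.acquire("page:42", "alice")
bob_blocked = locks.acquire("page:42", "bob")       # alice still holds it
now[0] = 1801.0                                      # alice idle for over 30 minutes
bob_after_expiry = locks.acquire("page:42", "bob")  # lock has lapsed
```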

I also don't know why you think that your event-sourced system will be better tested. Are you going to test for the network being unreliable, messages getting lost or being delivered out of order, and so on? If so, you can also afford to properly test a locking mechanism (which can be readily done in a monolith, maybe with an additional redis dependency, and is therefore more easily testable than some event-based logic that spans multiple services).

And in engineering, there are rarely "natural" solutions to problems. There are specific problems and they require specific solutions. Distributed systems, event sourcing etc. are great where they're called for. In many cases, they're simply not.