Reading back my own comments, I must apologize for my tone. I didn't mean to come off so negative. In fact, I do think your FSM is pretty cool.
So let me explain in more detail how you could solve this without all of the overhead you have created. Assuming a true FSM, there is no need to store the state as a value in your database (e.g. "awaiting_payment" in a [state] field). Instead, you should be storing your states as BIT fields:

[awaiting_payment], [awaiting_shipment], [shipped], etc

where "1" indicates the current state and every other state field is "0". In addition, you will need a nullable timestamp for each state:

[awaiting_payment_timestamp], [awaiting_shipment_timestamp], etc

to store when each state was entered, if at all. In this way, every order is in a single row and your analytics become trivial queries that don't require any complex joins or correlated sub-queries. Additionally, using simple constraints on each state field (e.g. the [awaiting_shipment] field <> 1 if the [awaiting_payment_timestamp] is NULL) would remove the need for your trigger and, again, reduce overhead.
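To make that concrete, here is a minimal sketch of such a snapshot table, using Python with SQLite standing in for whatever database you're actually on. The column names follow my examples above; the exact set of states and the constraint details would of course depend on your real schema:

```python
import sqlite3

# Accumulating snapshot: one row per order, one BIT flag plus one
# nullable timestamp per state. SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_state (
        order_id                    INTEGER PRIMARY KEY,
        awaiting_payment            INTEGER NOT NULL DEFAULT 0,
        awaiting_shipment           INTEGER NOT NULL DEFAULT 0,
        shipped                     INTEGER NOT NULL DEFAULT 0,
        awaiting_payment_timestamp  TEXT,
        awaiting_shipment_timestamp TEXT,
        shipped_timestamp           TEXT,
        -- can't be awaiting shipment if it was never awaiting payment
        CHECK (awaiting_shipment <> 1 OR awaiting_payment_timestamp IS NOT NULL),
        -- exactly one state flag is set at a time
        CHECK (awaiting_payment + awaiting_shipment + shipped = 1)
    )
""")

# One row per order; analytics are plain filters, no joins or sub-queries.
conn.execute("""
    INSERT INTO order_state
        (order_id, awaiting_shipment,
         awaiting_payment_timestamp, awaiting_shipment_timestamp)
    VALUES (1, 1, '2024-01-01 10:00', '2024-01-02 09:30')
""")
count = conn.execute(
    "SELECT COUNT(*) FROM order_state WHERE awaiting_shipment = 1"
).fetchone()[0]
print(count)
```

The CHECK constraints are doing the work your trigger does today: an insert that claims [awaiting_shipment] without an [awaiting_payment_timestamp] is simply rejected by the database.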
This is called an "accumulating snapshot" table, and it is used in data warehousing to solve the exact problem you are dealing with here - tracking an entity through a series of predetermined events to monitor its status and mark milestones. Using a "transaction" table like in your example, while certainly possible, stores your data in a way that makes it hard to answer the questions you may want to ask of it. You are spreading your entity across multiple rows instead of putting it all in one row.
Think about how the data translates to an object in your domain. Let's call this table "order_state" ("order_events" is a bad name for two reasons: it's plural - which is just wrong - and it doesn't faithfully describe your entity). I'm going to assume "order" and "order_item" tables as well. These three tables create an aggregate root that can be used to derive an Order object in your domain. This Order will have, at the very least, 3 properties corresponding to your entity's attributes in your database:
id, items, state
where "items" is a collection of OrderItem and state is OrderState (see where I'm going here?). The question becomes how the OrderState is populated. If your Order object represents the "context" in your FSM, then OrderState is the base class for all states:

AwaitingPaymentState, ShippingState, CancelledState, etc
This means polymorphism is inevitable. There are 3 common ways to deal with polymorphism in a database, but using a transaction table (like in your example) is not one of them. For your problem, "single table inheritance" (where all objects in a hierarchy map to a single table row) makes the most sense, because you are not storing different data in each child class - just a marker to indicate which state is current (this is also generally the most performant solution). It also means you don't need to add any additional tables, because your "order_state" table can serve both purposes.
Using a transaction table, you are adding a 3rd orthogonal dimension to your aggregate (another set of primary keys), so your object graph becomes needlessly large and it becomes a chore to choose and hydrate your current state.
My last critique is in regard to the choice of putting your transition logic in the database itself. I just don't see how that can be useful other than to "seem" clean. What I mean is, somehow an order actually has to be shipped or an item has to be paid for, and adding a row to a database table doesn't do that. So somewhere else in your stack you must have duplicated your transition logic in some fashion in order to facilitate the correct database calls when the events take place in your domain.
For example, using my above objects, when the "pay" event is raised on your Order object "context", it will invoke the "pay" event on its current state, AwaitingPaymentState, to undergo transition. In this scenario, there must be some method (an "action") that actually handles the payment. Either there is no state machine at all in your domain and you need to validate the state before processing the payment, OR you have a state machine like my example and you need to transition to the AwaitingShippingState. In either case, the logic is duplicated.
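For illustration, a minimal sketch of that state pattern. The class names mirror what I described above; charge_customer is a hypothetical stand-in for whatever actually processes the payment:

```python
# State pattern sketch: the Order "context" delegates events to its
# current state object, which performs the action and transitions.
class OrderState:
    def pay(self, order):
        raise ValueError(f"cannot pay in state {type(self).__name__}")

class AwaitingPaymentState(OrderState):
    def pay(self, order):
        order.charge_customer()                # the "action" that handles payment
        order.state = AwaitingShippingState()  # the transition

class AwaitingShippingState(OrderState):
    pass

class Order:
    def __init__(self):
        self.state = AwaitingPaymentState()
        self.paid = False

    def charge_customer(self):
        self.paid = True  # stand-in for real payment processing

    def pay(self):
        self.state.pay(self)  # context delegates the event to its current state

order = Order()
order.pay()
print(type(order.state).__name__)
```

Note that the transition knowledge (pay moves you from AwaitingPaymentState to AwaitingShippingState) lives right here in the domain - which is exactly the knowledge the database trigger duplicates.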
What you have created is a cool academic solution that may work (I don't doubt that it works), but this problem was solved 30 years ago, and if this was done on my team, I would insist it be redone to adhere to the basic principles of the entity relationship model and database design.
Hi, thanks for taking the time to explain your ideas.
There are certainly many ways to model this problem in the database. But to me the main idea of the article is implementing a FSM as a user-defined aggregate. This technique is actually not mutually exclusive with an "accumulating snapshot" table, and in fact the trigger could be modified to maintain such a table instead of, or in addition to, the order_events table.
I don't quite understand why you insist on not having a transactions (order_events) table. IMO it's very helpful for auditing purposes, and can be used to capture additional data for each state transition. E.g. you could easily put an IP column on it. From what I can tell, you'd have to add one column per event to do this with your accumulating snapshot table. Our application has ~10 events, so I'm not sure I'd like to have an explosion of columns on it.
Also, I think your solution falls apart completely when it's possible to re-enter a previous state (or would you overwrite the first timestamp?).
Last but not least, there are many advantages to putting logic into the db. I know you object to the advanced analytics because they can be done differently when you control the DB schema from day one. But imagine you inherit a database with an events table that can't easily be modified to use an accumulating snapshot table. How would you write analytical queries against it? Another advantage of keeping logic in the database is reducing the number of queries (and network overhead) that are needed.
Anyway, you've made many good suggestions, but I think you've taken the toy example a bit too seriously. It's very hard to come up with good real-world examples without disclosing details of your actual application (which I can't in this case).
FSMs backed by user-defined aggregates are definitely not an academic solution. This powers a mission-critical application of one of the world's big companies, for a database with over a billion records. It's also not the first iteration: previous implementations at the application layer had caused problems that FSMs inside the DB have solved. Last but not least, modeling the problem domain as FSMs has allowed the entire team (including non-technical people) to deeply and correctly understand how our analytics work.
Given the problem in the article, an accumulating snapshot table is simply the best solution for the reasons I laid out. If your actual system has different requirements in terms of storage and analysis (which apparently it does), I would be happy to hear about those and come up with an updated recommendation.
Along those same lines I'm not insisting you can't have a transaction table. If you need to record more information, like an IP address or user_id with each transition, then it would require a transaction table (although the article mentions none of this).
Honestly, the storage concerns I bring up (that is, the schema) are far less of a concern than the basic idea of storing your transitions in the database. Put simply, you are storing the wrong information. Instead of storing the actual state, you are storing a series of inputs that allow you to derive the state. This is a HUGE mistake. What if you add a new state? Say, [in_transit]. Every one of those 1 billion rows already in there can no longer be processed correctly, because they won't have the correct order of events.
This entire thing could be one table that stores [order_id], [state], [timestamp], [is_current], with a stored procedure to update it that takes an "order_id" and "event_name". Instead, you have created a house of cards.
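Something like this sketch, with SQLite in Python standing in for your Postgres setup (SQLite has no stored procedures, so the update procedure is a plain function here, and the transition map is illustrative):

```python
import sqlite3

# Single history table: one row per state the order has been in,
# with an indicator flag marking the current one.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_state (
        order_id   INTEGER NOT NULL,
        state      TEXT    NOT NULL,
        timestamp  TEXT    NOT NULL DEFAULT (datetime('now')),
        is_current INTEGER NOT NULL DEFAULT 1
    )
""")

TRANSITIONS = {  # (current_state, event_name) -> next_state
    (None, "create"):              "awaiting_payment",
    ("awaiting_payment", "pay"):   "awaiting_shipment",
    ("awaiting_shipment", "ship"): "shipped",
}

def apply_event(order_id, event_name):
    """The 'stored procedure': takes an order_id and event_name."""
    row = conn.execute(
        "SELECT state FROM order_state WHERE order_id = ? AND is_current = 1",
        (order_id,),
    ).fetchone()
    current = row[0] if row else None
    next_state = TRANSITIONS[(current, event_name)]  # KeyError on illegal event
    conn.execute(
        "UPDATE order_state SET is_current = 0 WHERE order_id = ?", (order_id,))
    conn.execute(
        "INSERT INTO order_state (order_id, state) VALUES (?, ?)",
        (order_id, next_state))

apply_event(1, "create")
apply_event(1, "pay")
row = conn.execute(
    "SELECT state FROM order_state WHERE order_id = 1 AND is_current = 1"
).fetchone()
print(row[0])
```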
Our actual system is constrained by an existing DB schema that can't be changed, that's already using an events table, and that needs to be analyzed. I wrote some more about it here [1]. The FSMs are also cyclic, which can't be handled by your approach, as I pointed out earlier.
Anyway, coming back to the example from the post, if I change the FSM (e.g. add a state/event) I need to make sure all existing data conforms to it one way or another (by making the change backwards compatible or by migrating the data). But the same problem exists for any other invariant you're trying to enforce/add to your DB (FKs, Checks, etc.). It's just the cost of enforcing an invariant.
I'm also confused at this point how you jump between saying it's okay to have a transactions table and then later backtrack and say it's a huge mistake because it's not storing the actual state. I already pointed out that the two are not mutually exclusive, and that the trigger could maintain an accumulating snapshot in addition to the order_events table. In fact, the trigger could even derive the new state based on the latest value from the accumulating snapshot table, which may also address some of your concerns about adding new states.
Also, I haven't created a house of cards. I had the novel idea of leveraging user-defined aggregates as finite state machines and decided to share it with the PostgreSQL community. I couldn't use my company's business domain as an example, so I had to come up with a bogus one that people would understand easily. Now don't get me wrong, I'm totally enjoying the discussion around this bogus example, but I'm not enjoying your unwavering faith in the absolute correctness of your ideas and the HUGE mistakes you claim I'm making. So this is probably my last reply unless you're cool with toning it down a bit.
If you are bound to an existing database schema then there isn't much you can do (depending on how bound you are of course). That's a totally valid reason to choose your current path. I'm not naive to business requirements. Often, systems end up being far from theoretically perfect due to legacy {insert component of stack here}. I get it.
So ruling out a snapshot table (I think you have made clear this is not an option), your best choice would then be a transaction table where your [state] is a Type 2 Slowly Changing Dimension (SCD2) with an indicator field for the current state. I want to be clear here: you should be storing your state, NOT the events (arguments) that can be used to derive your state. I cannot express this enough. Your problem is not unique. You simply cannot be naive enough to think you are the first person to want to store the state history of an FSM in a database. This is literally data warehousing 101. The novelty of your solution stems from the fact that you are (dangerously) misunderstanding the problem. If, for some reason, you need to store which [event_name] caused the transition to the [state], just add a field. Simple.
As for invariants, let's use an example: say you add a "package" event that transitions to an [in_transit] state between [awaiting_shipment] and [shipped] (note we may also then want to change "ship" to "deliver"). This simple and totally reasonable change would make every order with the event sequence "create", "pay", "ship" result in [error]. If you just stored the [state], then a [shipped] order would remain [shipped] regardless of what series of events brought it there - which would be correct. There would be no need to go back and change historical data - nor would you want to. How would that even be done? Manufacture shipping information and timestamps? There are no invariants in this process (again, I want to reiterate that this is a solved problem). True FSMs (if this is actually modeled as one) only care about the current state anyway, and maybe the previous state (see SCD3). What I mean is, future actions are determined using the current state. If the current state becomes [error] for 1 billion orders, that's a problem. A huge problem.
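To make the replay problem concrete, here is a tiny sketch (event and state names are illustrative) of the same stored event log replayed against the old and the new transition map:

```python
OLD = {  # (state, event) -> next state, before the change
    ("start", "create"):           "awaiting_payment",
    ("awaiting_payment", "pay"):   "awaiting_shipment",
    ("awaiting_shipment", "ship"): "shipped",
}
NEW = {  # after inserting [in_transit], "ship" no longer leaves [awaiting_shipment]
    ("start", "create"):              "awaiting_payment",
    ("awaiting_payment", "pay"):      "awaiting_shipment",
    ("awaiting_shipment", "package"): "in_transit",
    ("in_transit", "ship"):           "shipped",
}

def replay(events, transitions):
    """Derive the current state by replaying a stored event log."""
    state = "start"
    for event in events:
        state = transitions.get((state, event), "error")
    return state

log = ["create", "pay", "ship"]  # a historical order's event log
print(replay(log, OLD))  # shipped
print(replay(log, NEW))  # error
```

If the table stored [state] instead of the event log, the historical order would simply stay [shipped]; it's only the derive-from-events design that retroactively breaks.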
"Absolute correctness" is an interesting term... and I agree that I am probably coming off a bit aggressive in my comments. Look, I'm not trying to be a bully here, and I DO think that the FSM you have created is interesting, unique, and a cool example of how user-defined aggregates can be used. The problem with it is, unfortunately, an extremely common one. If I had a nickel for every time a programmer jumped into the world of databases and came up with a "new" solution to a "never-before-seen" problem... I'd be a wealthy man. Relational databases have been around for a long time and are one of THE MOST studied and written about pieces of infrastructure/software in existence (they're interesting for so many reasons!). What this means is that it's EXTREMELY unlikely you have found a problem that hasn't already been "solved" (of course there can be wiggle room) in a theoretically correct manner. It's almost always just an individual not recognizing (or breaking down) the problem efficiently.
And that's what I am seeing here: a problem that is present and handled in LOTS of databases (I've personally done this dozens of times), but is being approached from a less-than-perfect (albeit unique) angle. I can't say that I'm "absolutely correct"... I don't have the full grasp of your domain and business requirements. I don't know precisely the degree of constraint your schema is imposing. I don't know how well communication between teams and management of your stack is handled (yes, these can be important - you did mention you had problems in these areas). Maybe yours really is the best solution - though I remain unconvinced. But I can say that, given what I know about the problem, the entity relationship model, database design, and data warehousing, your solution is simply not sound. It's difficult to put that nicely. I don't mean that you're wrong (apparently this is working for you). It's not black and white. Just that if one of my clients came to me with similar requirements, I would follow "the book", because this has already been written.
And if you want a transitions table for your FSM to enforce explicit constraints (I don't recommend this, but it's MUCH better than hiding this data in a function/trigger):
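Something along these lines, with the legal transitions stored as plain rows that any query (or constraint) can see, rather than buried in procedural code (SQLite via Python here; names are illustrative):

```python
import sqlite3

# The FSM's legal transitions as data: one row per (from_state, event).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fsm_transition (
        from_state TEXT NOT NULL,
        event_name TEXT NOT NULL,
        to_state   TEXT NOT NULL,
        PRIMARY KEY (from_state, event_name)
    );
    INSERT INTO fsm_transition VALUES
        ('start',             'create', 'awaiting_payment'),
        ('awaiting_payment',  'pay',    'awaiting_shipment'),
        ('awaiting_shipment', 'ship',   'shipped');
""")

def next_state(current, event):
    """Look up the transition; an illegal event simply finds no row."""
    row = conn.execute(
        "SELECT to_state FROM fsm_transition"
        " WHERE from_state = ? AND event_name = ?",
        (current, event),
    ).fetchone()
    if row is None:
        raise ValueError(f"illegal event {event!r} in state {current!r}")
    return row[0]

print(next_state("awaiting_payment", "pay"))
```

Adding a state is then an INSERT into this table, not a change to trigger code.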
Does this make more sense?