
No post on state machines would be complete without mention of the superset concept of Petri nets [1]. I have used Petri nets to build my own workflow engine previously. Building a workflow engine is one of those things... nearly every enterprise app has one, and normally you shouldn’t need to build one if you can source one... but it is an exercise that makes you a stronger engineer/developer too, deepening understanding of the patterns that lie beneath most systems.

Back when I was building mine, openly available workflow engines, BPML, BPEL etc. were not really a thing yet, so it made a little more sense to be building one.

[1] https://en.wikipedia.org/wiki/Petri_net




This is my first encounter with Petri nets, and at first I thought they were just like state machines, in that you're only concerned with flow of control... boy was I wrong. It's going to take a while to fully grok, but they seem impressively powerful in what they can do. Thanks!


Amusingly, I was warned away from using Petri nets as the execution model for a workflow manager because "you can't prove the net will ever complete". Since then I've used several Petri net-based workflow engines that ran large-scale industrial computing projects without problems, so I think that complaint was excessively theoretical.


That is funny, appreciate the anecdote. I have encountered resistance to applying lesser-known ideas in daily practice because, even as engineers, we sometimes fear that we are veering off into esoteric-land. Especially in a corporate setting, getting the balance right between true engineering, with the science involved, and "go fast and break things" is tough.

I have always felt that the key to being an engineer is to understand that theory is a tool, and what makes it fact is science (wow, cool science! LOL) - testing the theory by putting it into practice and measuring the result/iterating. The fact that Petri nets give you a way to talk about the proof as well as model the problem is part of what makes them interesting to me.

So I guess I'm saying, moving fast while applying science.. and the learning and iterating.. that just makes sense right??? If the result is a working system, even better! Science and being agile are totally compatible!

The burden of proof is on everyone, to say, OK.. I am following this technique because it helped me build a working system. I am standing on the shoulders of giants, and the system has been tested and works... but if you don't think the theory is correct, then please, prove it wrong. Make it better, improve, iterate!

In this case, with a genuine and positive attitude, I kindly turn the computer keyboard around to the other person and ask them if they know a better way, to please explain it and I'm happy to learn and adapt. I'll show my proof if you show me yours, and we can discuss, kind of thing.


They are great! My bachelor thesis was implementing an algorithm to generate minimal Petri nets from an infinite partial language, described in something like a basic regex.


Do you have more information on how to build workflow systems, and why is it good to do so with Petri nets rather than abstract the processes with classes or whatever else?

Any information is welcome


I'm one of the original creators of the temporal.io open source project.

A long time ago I worked with an internal Amazon workflow engine that was based on Petri nets. It worked, but I learned that for the majority of business-level workflows it was too low level. It was always a hassle to map a real use case to it.

The AWS Simple Workflow service was created as a replacement for that engine. It greatly elevated the level of abstraction. Azure Durable Functions, Uber Cadence, and our latest project temporal.io all use this new way to represent workflows. The basic idea is to not implement the business logic using an intermediate representation. Instead, rely on the runtime to make any code state fully fault tolerant. Imagine that the full orchestrating program state (including threads, local variables and even blocking calls) is preserved across any process and other infrastructure failures. This allows you to write production code like this:

      // Runs inside a workflow method; `activities` and `customerId` come from the surrounding workflow code.
      boolean trialPeriod = true;
      while (true) {
        // Durable timer: the wait is part of the preserved workflow state.
        Workflow.sleep(Duration.ofDays(30));
        activities.chargeMonthlyFee(customerId);
        if (trialPeriod) {
          activities.sendEndOfTrialEmail(customerId);
          trialPeriod = false;
        } else {
          activities.sendMonthlyChargeEmail(customerId);
        }
      }

As the state is fault tolerant, the 30-day sleep is absolutely OK. And if any of these operations takes a long time (for example, because a downstream service is down for a day), the code doesn't need to change. I bet that any Petri net based version of the above code would be much more complicated. One of the reasons is that a large part of the complexity of business-level workflows is not in the sequencing of actions (which Petri nets solve), but in state management and argument passing (which Petri nets don't solve at all).


Thank you for adding this flavor to the discussion. I do agree that Petri nets are often seen as too complex for basic business process modelling, which is why languages such as BPEL (standardized by OASIS) and BPMN (now maintained by OMG) were created, alongside simpler notations such as UML activity diagrams.

That is to say, the level of complexity in describing a process in the Petri net "modelling language" might seem higher. I can surely see how it feels more esoteric, and it is not always clear how it ties to the implementation.

Many workflow engines take the approach you're describing, for example, I also used to use Microsoft's Windows Workflow Foundation to do similar things. Essentially you're sketching a workflow process skeleton and then managing the state atomically by compartmentalizing it, so to speak.

Actually, this is exactly what Petri nets propose - state is defined by tokens in places - i.e. compartmentalized.

I don't entirely agree with your comment about state management and argument passing in Petri nets. I do agree it takes digging around to find tangible examples/applications that cover argument passing, but the idea of "tokens in places and how they enable transitions of state" is the part of the puzzle that represents, in an abstract way, the tiny pieces of state/arguments that enable transitions to fire. I could represent your code above as transitions which cannot fire until tokens representing the state of your conditions are present in the right places. For example, a token representing the passage of time would sit in one input place, and tokens representing the condition trial period = true and a non-blank customer ID would sit in others; all of them have to be present in their input places to enable the transitions that trigger those activities.
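
As a rough illustration of the token idea, here is a minimal Java sketch of the billing loop as a place/transition net. The place names (monthElapsed, trialPeriod, paidPeriod, charged), the class name and the tiny engine are all made up for this example and are not taken from any real workflow engine; the transition names simply reuse the activity names from the snippet quoted above. The timer is assumed to be something outside the net that drops a token into monthElapsed once a month, and customerId is left out because plain tokens carry no data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal place/transition net; every place name here is hypothetical. */
class BillingNetSketch {
    /** A transition takes one token from each input place and adds one to each output place. */
    record Transition(String name, List<String> inputs, List<String> outputs) {}

    final Map<String, Integer> marking = new HashMap<>();   // token count per place
    final List<Transition> transitions = new ArrayList<>();

    /** Adds one token to a place. */
    void put(String place) { marking.merge(place, 1, Integer::sum); }

    /** A transition is enabled when every input place holds at least one token. */
    boolean enabled(Transition t) {
        return t.inputs().stream().allMatch(p -> marking.getOrDefault(p, 0) > 0);
    }

    void fire(Transition t) {
        t.inputs().forEach(p -> marking.merge(p, -1, Integer::sum));
        t.outputs().forEach(this::put);
    }

    /** Fires enabled transitions until none is left; returns their names in firing order. */
    List<String> runUntilQuiet() {
        List<String> fired = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Transition t : transitions) {
                if (enabled(t)) { fire(t); fired.add(t.name()); progress = true; }
            }
        }
        return fired;
    }

    /** Builds the net with one token in trialPeriod, mirroring trialPeriod = true. */
    static BillingNetSketch billingNet() {
        BillingNetSketch net = new BillingNetSketch();
        net.transitions.add(new Transition("chargeMonthlyFee",
            List.of("monthElapsed"), List.of("charged")));
        net.transitions.add(new Transition("sendEndOfTrialEmail",
            List.of("charged", "trialPeriod"), List.of("paidPeriod")));
        net.transitions.add(new Transition("sendMonthlyChargeEmail",
            List.of("charged", "paidPeriod"), List.of("paidPeriod")));
        net.put("trialPeriod");
        return net;
    }

    public static void main(String[] args) {
        BillingNetSketch net = billingNet();
        net.put("monthElapsed");                  // first 30 days pass
        System.out.println(net.runUntilQuiet());  // [chargeMonthlyFee, sendEndOfTrialEmail]
        net.put("monthElapsed");                  // next 30 days pass
        System.out.println(net.runUntilQuiet());  // [chargeMonthlyFee, sendMonthlyChargeEmail]
    }
}
```

The if/else from the original loop turns into two transitions competing for the charged token; which one can fire depends on whether the trialPeriod or the paidPeriod place is marked. Passing real arguments such as the customer ID would need tokens that carry values, which is what coloured Petri nets add.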

This is to say that, I agree that representing that graphically using Petri net modelling may not be as business friendly as say, UML activity diagramming. But it also doesn't make the simpler approach any less a subset of what you can do with Petri nets, as it very much is one.

But definitely agree, use the right level of abstraction that fits the need, like the old adage, try to use the right tool for the job.

I'd argue that, the tool you describe could be modeled as a Petri net, but that perhaps you may not wish to have a user do it that way.

Do you agree or have a different opinion?


> I also used to use Microsoft's Windows Workflow Foundation to do similar things. Essentially you're sketching a workflow process skeleton and then managing the state atomically by compartmentalizing it, so to speak.

temporal.io and Windows Workflow Foundation are completely different. WF uses code to generate an intermediate representation of the workflow AST, and that intermediate representation is what gets executed. It is similar to most existing workflow engines, from Airflow to the BPMN ones.

Temporal doesn't use any intermediate representation. It directly executes the code keeping all its state durable.

> I don't entirely agree with your comment about state management and argument passing in Petri nets.

When I talk about state, I mean the overall state of the business process, which is much more than a token in a certain place. For example, such state can include a list of line items to order, with all associated attributes, and counters of processed events.

> I'd argue that, the tool you describe could be modeled as a Petri net, but that perhaps you may not wish to have a user do it that way.

All software at the lowest level can be modeled as a state machine, and a Petri net is one of the possible representations of an FSM. I'm interested in simplifying the way the workflow logic is specified by an engineer. At this point I believe the temporal.io approach of giving you the full power of a high-level programming language and taking care of durability and scalability is the best option.


NOTE: I am not associated with temporal.io in any way, just learned about it frankly.

I just had a look through your docs and I understand the approach you're taking... similar to "infrastructure as code" concept applied to simplifying workflow. What you are saying makes sense to me. I haven't heard of temporal.io before (cool, hi!), but I get the concept of trying to eliminate the intermediate representation (which I suppose also could be seen as a bottleneck in a fully distributed system and to the "workflow" of a software engineer).

I would point out that, Petri net is a theory and modelling language for proving correctness/soundness of a model - and as I mentioned earlier, I agree that if you don't need that for your use case, yeah - it could be too much. I would like to make it clear for other readers that Petri nets are not specifically the implementation of a technology, they are a modelling technique/concept which could also be implemented/executable through an engine. Having, or not having, an intermediate representation in compiled code or runtime has nothing to do with whether or not you want to model your FSM, or graph, or network of processes, logic and state as a Petri net.

My goal in originally posting the comment was just to share the superset theory of Petri nets, as I don't often see people bring it up in discussions on workflow, FSM, etc.

The comment that Petri nets are one possible representation of FSM is true, with the key difference being that FSMs are for single-threaded operations and have some limitations in that regard, whereas Petri nets, as the superset, also handle concurrent operations - and workflow is a common use case where we see concurrent operations (but certainly not the only one, and industrial control, etc. is certainly more complicated than organizational workflow). I think this deck is kind of handy for describing some of the differences between FSM and Petri nets [1]. Interestingly, we know that dealing with concurrency, parallel processing, multithreading is difficult, which is why any tools to reduce some of that complexity from day-to-day coding (ex. coroutines/async/await, workflow engines, etc.) make the engineer's life easier.
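
To make the concurrency point concrete, here is a small Java sketch of a fork/join net. The place and transition names (start, a1, b1, a2, b2, done, fork, doA, doB, join) are hypothetical, and the sketch assumes a safe net, meaning at most one token per place, so a marking can be stored as a plain set of place names. It enumerates every reachable marking by breadth-first search; each of those markings is one state that an equivalent flat FSM would have to spell out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Fork/join net explored by brute force; assumes at most one token per place. */
class ForkJoinSketch {
    record Transition(String name, Set<String> inputs, Set<String> outputs) {}

    // Two branches (A and B) run concurrently between fork and join.
    static final List<Transition> NET = List.of(
        new Transition("fork", Set.of("start"), Set.of("a1", "b1")),
        new Transition("doA", Set.of("a1"), Set.of("a2")),
        new Transition("doB", Set.of("b1"), Set.of("b2")),
        new Transition("join", Set.of("a2", "b2"), Set.of("done")));

    /** Breadth-first search over every marking reachable from the initial one. */
    static Set<Set<String>> reachableMarkings(Set<String> initial) {
        Set<Set<String>> seen = new HashSet<>();
        Deque<Set<String>> queue = new ArrayDeque<>();
        seen.add(initial);
        queue.add(initial);
        while (!queue.isEmpty()) {
            Set<String> marking = queue.poll();
            for (Transition t : NET) {
                if (!marking.containsAll(t.inputs())) continue;  // not enabled
                Set<String> next = new HashSet<>(marking);
                next.removeAll(t.inputs());                      // consume tokens
                next.addAll(t.outputs());                        // produce tokens
                if (seen.add(next)) queue.add(next);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Set<String>> markings = reachableMarkings(Set.of("start"));
        System.out.println(markings.size() + " reachable markings");               // 6 reachable markings
        System.out.println("can complete: " + markings.contains(Set.of("done")));  // can complete: true
    }
}
```

The net needs four transitions, while the flat state space already has six markings; with n parallel one-step branches the count grows as 2^n + 2. The same search also shows that the done marking is reachable, though that only works this simply because the net is small and bounded.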

When we were talking about compartmentalization/encapsulation of state, I see how conceptually you are abstracting that away from the engineer in temporal. I suppose, what I meant when I was talking about WWF was, the fact that the engineer's code and their own state could be "compartmentalized" into the pluggable definitions of the activities provided by the workflow engine. It seems like your team has kind of inverted that a little bit, but roughly the idea is similar. You appear to have the engineer "write the workflow as code" and use a wrapper around the activity to ensure the state management and simplify it for the engineer. A slight difference but interesting paradigm.

I would say, I believe that, while rarely done - one could define WWF workflows completely through code and that there is a way to serialize workflow as an artifact that could be version controlled, etc. But, how they display that visually and how WWF compiles into the intermediate representation and uses that at runtime does sound different than temporal. I suppose though, having the intermediate representation does perhaps allow for running tests against the workflow logic itself - whereas you shift the tests to the code itself.. so, different implementation but similar idea.

Again, I see how temporal simplifies that from the engineer's perspective, so kudos to you guys. I can see how what you are saying would work and be a helpful approach for a lot of use cases.

[1] https://www.cs.ucdavis.edu/~devanbu/teaching/160/docs/petrin...


There are two components to your question - the practical side, "how do I build a workflow system?" and then, "why Petri nets?".

If you really want to build a generic workflow engine, I think the way is to identify the pattern they follow and implement that. Build classes, attributes and methods that abstract away automating a process in a generic sense (e.g. turn workflow into one or more services) and then integrate those service(s) into your application as the glue that holds together complex processes. It's obviously harder than that one sentence, of course there are lots of nuances, but that's probably the simplest way I can think of to explain the high-level hand-wavy approach to doing this if you want to build it yourself, without going into all the details.

If you understand the pattern, and would rather not build, but buy or obtain an open source workflow engine/management system, many exist [1][2]. Most enterprise applications (e.g. Oracle, SAP, Salesforce, etc. etc.) have such workflow tools built in already.

Regarding the second part of your question, "why Petri nets?", well, one could argue that any system that automates a process in any way is a kind of workflow management system/engine or could be modelled as a "Petri net". I guess you could say, "why care about patterns in software engineering" then? The difference is, like many patterns, Petri nets give you a tried/tested technique for conceptualizing and modelling the system/process, validating its correctness, and in some cases, tools/engines that implement these concepts can even give you the executable framework for building too. You know, standing on the shoulders of giants and that kind of thing.

Technically, Petri nets are a superset for modelling/designing/visualizing/validating what you can do in a workflow system. Also, to be clear, the concept of Petri nets is in every way compatible with/related to object oriented design, and even functional and procedural programming paradigms.

Perhaps I can try to answer your question in this way.. by relating Petri nets to design patterns in software engineering (think, Gang of Four, Patterns of Enterprise Architecture, etc.). You know how, when you're developing software - over the years, you start to see patterns emerge? The best way to understand Petri nets and how they relate is to see what problem they solve in work you might have done yourself.

If you have ever implemented a large chunk of any information system, you start to realize that, even though it might be composed of many smaller components working together, ultimately it had to enable some kind of overall process to function. You also realize that certain core components are more infrastructural in nature and are reusable. For example, logging, security (AAA, authentication, authorization, access control), etc.

I was exposed to workflow systems when I first started working on case management systems (for example, legal cases). Case management is a scenario that includes, coordinating multiple child processes to build a "case", which is an instance of a process that is being executed. Think.. Case ID #123 is an instance of a distributed process, which results in a case file, but may have many independent and ancillary sub-processes, approvals, communications, notes, reviews, etc. that must come together to "complete" the case.

You might diagram the flow of that legal process in a "flow chart". Business people love flowcharts right? The next logical step is to think of.. what if these flow charts could represent/become computable models (e.g. math/validity/correctness) and perhaps even a shell of executable code?

You could construct an entire monolithic case management system, that handled all the work related to legal cases - by learning the business process, hard-coding the logic from that flow chart (classes with attributes and methods that define the full behavior of the system, interactions, etc. if we were doing OOP), and so on. If you did that, and you did that reasonably well, that system would certainly work for its designated scenario (legal cases).

In the midst of doing that, you might have realized that, the IT ticketing system you bought for your organization, had a similar process (for example, case management in ITIL). You might notice.. startling similarities between the basic design of that system and the legal case management one.. to the point where you start asking.. is there a pattern here?

Sure enough, there is one. The basis for that pattern is what some call workflow patterns - Professor Wil van der Aalst and Professor Arthur ter Hofstede being two influential thinkers in the area [3]. If we follow this thread, we find one helpful paper: "The Application of Petri Nets to Workflow Management" [4], which I think would be of interest.

Then we start to realize that some processes don't happen in a monolithic way. They need to be "consistent", but the underlying code and execution could be distributed in different services or different systems, and yet our process needs to pull it all together to carry out the work of the process. We might see that even though distributed, these processes and systems could be represented holistically. Petri nets help in this situation.

I think one thing people might have missed when we got into this whole world of federated, independent teams, service-oriented architecture and now microservices, was that - ultimately, systematic behavior and processes must still function, and function well enough for organizations to fulfill their purpose. While self-emergent behaviors are entirely possible and arise all the time, in complex systems.. when you absolutely must make your system/process work, you need a way to engineer that and Petri nets give you a way to think about, model, and validate/reason about that.

For a current very relevant example, consider microservices today. On their own, they provide little value - but when orchestrated and working together to compose a larger system, to enable human and machine processes to execute and thrive, it would be nice to have ways to think about and model those processes that live on top. Petri nets are one way to conceptualize and visualize those processes, where that conceptual model can be proven to be sound mathematically, and even turned in some cases directly into an executable framework within which to plug in your own code. That's what workflow systems often do, for example.

Another relevant and recent case - event sourcing.. and event-driven mechanisms, and event handling, and the processing that goes along with it to essentially create a "directed process" - these are all directly related too. Also, there's a reason that data workflow automations such as Apache Spark or Airflow, for example, chose directed acyclic graphs (DAG) as models for complex, reliable, distributed process execution - and you can represent those as Petri nets too.
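
One way to read a DAG as a Petri net, sketched in Java under a few assumptions: each task becomes a transition, each edge becomes a place, and finishing a task puts a token on each of its outgoing edges. The task names (extract, transform, validate, load) and the class are hypothetical, this is not how Spark or Airflow are implemented, and the pending list stands in for a "not yet run" input place that lets each transition fire only once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Treats a task DAG as a net: tasks are transitions, edges are places. */
class DagAsNetSketch {
    /** Edge "from -> to" in the task graph; each edge is one place in the net. */
    record Edge(String from, String to) {}

    /** Fires each task once, as soon as all of its incoming edge places hold a token. */
    static List<String> run(List<String> tasks, List<Edge> edges) {
        Map<Edge, Integer> tokens = new HashMap<>();
        List<String> pending = new ArrayList<>(tasks);  // tasks that have not fired yet
        List<String> fired = new ArrayList<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String task : List.copyOf(pending)) {
                // A task with no incoming edges is enabled right away.
                boolean enabled = edges.stream()
                    .filter(e -> e.to().equals(task))
                    .allMatch(e -> tokens.getOrDefault(e, 0) > 0);
                if (!enabled) continue;
                // Consume one token per incoming edge, produce one per outgoing edge.
                edges.stream().filter(e -> e.to().equals(task))
                    .forEach(e -> tokens.merge(e, -1, Integer::sum));
                edges.stream().filter(e -> e.from().equals(task))
                    .forEach(e -> tokens.merge(e, 1, Integer::sum));
                pending.remove(task);
                fired.add(task);
                progress = true;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
            new Edge("extract", "transform"), new Edge("extract", "validate"),
            new Edge("transform", "load"), new Edge("validate", "load"));
        // Tasks are listed out of order on purpose; the tokens decide what can run.
        System.out.println(run(List.of("load", "transform", "validate", "extract"), edges));
        // prints [extract, transform, validate, load]
    }
}
```

The firing order that comes out is a topological order of the graph: load cannot fire until both transform and validate have put a token on their edges into it.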

What I always liked about Petri nets when I discovered them, was it gave me a way to link together strict state management, with process control, based in logical/provable theory - but also gave me a way to bridge between the "human" side of process ("flow charts") and the technical side of things. It gave me a framework to "plug in" my different systems, functions, services, objects and behaviors into something that coordinated complex process both in theory and in practice.

I do not believe that one must implement monolithic process in code - as there is also emergent behavior - but, Petri nets and workflow management systems give you a way to think conceptually about how your system works.. and even potentially to build the "glue" that puts your system together, if you chose.

By the way, one interesting thing I saw that came out of this work, was simulation software where you could use Petri nets to essentially "run" a process in a simulated way and see if there were any natural bottlenecks in the process that you could optimize in advance - before building any code at all. Process simulation [5] is another whole related rabbit hole to go down, but a fascinating one!

[1] https://en.wikipedia.org/wiki/Workflow_management_system

[2] https://github.com/meirwah/awesome-workflow-engines

[3] http://www.workflowpatterns.com/

[4] https://www.semanticscholar.org/paper/The-Application-of-Pet...

[5] https://en.wikipedia.org/wiki/Process_simulation


Thanks for the long and thoughtful answer. Perhaps my biggest remaining doubt is why PNs instead of graphs and state machines, but I guess I should start reading some of the links you posted to discover more of their potential, as I see there are many uses and properties which are easy to miss.


Think of it like this - please, go ahead and use graphs and state machines. If you would like to model the operation of those graphs or state machines, one way to do it is to model them using Petri nets. And if you use Petri net modelling, you get some nice properties that have mathematical proofs behind how they work. They might be handy for working out logical problems with your graph or state machine due to how it is constructed - problems which you discover based on the language Petri nets give you for talking about the process.

Like, set logic in relational modeling.. if you abide by that technique, set logic gives you certain guarantees about operational characteristics that have mathematically proven grounding. It gives you primitives for talking about set operations, like union, difference, intersection, join, etc.

Petri nets are a modelling technique. Activity diagrams (UML) are another way of expressing similar processes, but they don't necessarily have the mathematical grounding inherent in the way you model Petri nets.

Perhaps yet another way of saying it is that Petri nets are closer to a visual modelling language of execution that shows more of the logic of the concerned process or system.


Thank you very much for writing this extremely insightful post.


Thanks for the Wikipedia link. Do you have any recommendations for books or papers to read? Not just for state machines and Petri nets, but also any other software models you would consider useful to learn and potentially implement.



