Show HN: DBOS transact – Ultra-lightweight durable execution in Python (github.com/dbos-inc)
89 points by jedberg 19 days ago | 30 comments
Hi HN - DBOS CEO here with the co-founders of DBOS, Peter (KraftyOne) and Qian (qianli_cs). The company started as a research project at Stanford and MIT, where Peter and Qian were advised by Mike Stonebraker, the creator of Postgres, and Matei Zaharia, the creator of Spark. They believe so strongly in reliable, serverless compute that they started a company (with Mike) to bring it to the world!

Today we want to share our brand new Python library providing ultra-lightweight durable execution.

https://github.com/dbos-inc/dbos-transact-py

Durable execution means your program is resilient to any failure. If it is ever interrupted or crashes, all your workflows will automatically resume from the last completed step. If you want to see durable execution in action, check out this demo app:

https://demo-widget-store.cloud.dbos.dev/

Or if you’re like me and want to skip straight to the Python decorators in action, here’s the demo app’s backend – an online store with reliability and correctness in just 200 LOC:

https://github.com/dbos-inc/dbos-demo-apps/blob/main/python/...

Don't want to keep reading and just want to try it out?

https://console.dbos.dev/launch

No matter how many times you try to crash it, it always resumes from exactly where it left off! And yes, that button really does crash the app.

Under the hood, this works by storing your program's execution state (which workflows are currently executing and which steps they've completed) in a Postgres database. So all you need to use it is a Postgres database to connect to—there's no need for a "workflow server." This approach is also incredibly fast, for example 25x faster than AWS Step Functions.
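For a feel of the programming model, here's a minimal sketch (the @DBOS.workflow and @DBOS.step decorator names follow the docs; the checkout functions, their bodies, and the elided configuration are made up for illustration):

    from dbos import DBOS  # pip install dbos

    # Configuration (Postgres connection, app setup, etc.) is elided here.

    @DBOS.step()
    def reserve_inventory(order_id: str) -> None:
        ...  # e.g., decrement stock in an external system

    @DBOS.step()
    def charge_card(order_id: str) -> None:
        ...  # e.g., call a payment provider

    @DBOS.workflow()
    def checkout(order_id: str) -> None:
        # Each step's completion is checkpointed in Postgres. If the process
        # crashes between the two steps, the workflow resumes after
        # reserve_inventory instead of starting over.
        reserve_inventory(order_id)
        charge_card(order_id)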

Some more cool features include:

* Scheduled jobs—run your workflows exactly-once per time interval, no more need for cron (see the sketch after this list).

* Exactly-once event processing—use workflows to process incoming events (for example, from a Kafka topic) exactly-once. No more need for complex code to avoid repeated processing.

* Observability—all workflows automatically emit OpenTelemetry traces.
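For the scheduled jobs mentioned above, here's a rough sketch (the @DBOS.scheduled decorator and its crontab syntax follow the docs; the report logic is just for illustration):

    from datetime import datetime
    from dbos import DBOS

    @DBOS.scheduled("0 * * * *")  # crontab syntax: top of every hour
    @DBOS.workflow()
    def hourly_report(scheduled_time: datetime, actual_time: datetime) -> None:
        # Runs exactly once per scheduled interval, even if the process
        # restarts around the scheduled time.
        print(f"Report for {scheduled_time}, started at {actual_time}")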

Docs: https://docs.dbos.dev/

Examples: https://docs.dbos.dev/examples

We also have a webinar on Thursday where we will walk through the new library; you can sign up here: https://www.dbos.dev/webcast/dbos-transact-python

We'd love to hear what you think! We’ll be in the comments for the rest of the day to answer any questions you may have.




Hey all, I'm excited to be the new CEO of DBOS! I'm coming up on my one-month anniversary. I joined because I truly believe DBOS solves a lot of the main issues with serverless deployments. I still believe that serverless is the way of the future for most applications, and I'm excited to make it a reality.

Ask me anything!


Would it be correct to say that these client libraries provide the functionality (e.g. ease of transactions, once-only execution, recovery), whereas your cloud offering solves the scaling/performance issues you’d hit trying to do this with a regular pg-compatible DB?

I do a lot of consulting on Kafka-related architectures and really like the concept of DBOS.

Customers tend to hit a wall of complexity when they want to actually use their streaming data (as distinct from simply piping it into a DWH). Being able to delegate a lot of that complexity to the lower layers is very appealing.

Would DBOS align with / complement these types of Kafka streaming pipelines or are you addressing a different need?


Yeah exactly! The Kafka use case is a great one--specifically writing consumers that perform real-world processing on events from Kafka.

In fact, one of our first customers used DBOS to build an event processing pipeline from Kafka. They hit the "wall of complexity" you described trying to persist events from Kafka to multiple backend data stores and services. DBOS made it much simpler because they could just write (and serverlessly deploy) durable workflows that ran exactly-once per Kafka message.
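Roughly the shape of such a pipeline (an illustrative sketch, not the customer's actual code; the message fields, the step bodies, and the ID scheme are assumptions, and SetWorkflowID is the context manager from the docs):

    from dbos import DBOS, SetWorkflowID

    @DBOS.step()
    def write_to_warehouse(event: dict) -> None:
        ...  # persist the event to one backend store

    @DBOS.step()
    def notify_downstream(event: dict) -> None:
        ...  # call another service

    @DBOS.workflow()
    def process_event(event: dict) -> None:
        write_to_warehouse(event)
        notify_downstream(event)

    def on_kafka_message(msg) -> None:
        # Key the workflow ID on topic/partition/offset so a redelivered
        # message attaches to the existing workflow instead of re-running it.
        with SetWorkflowID(f"{msg.topic}-{msg.partition}-{msg.offset}"):
            process_event(msg.value)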


Recently I came to know about https://www.membrane.io/, which also follows a similar approach, but it looks like it is more for internal apps and small projects.

How would you compare DBOS with that?


From a high level what we offer is similar -- durable and reliable compute.

There isn't a lot of public information about how they are built, but from what I can tell you're right -- their architecture is more oriented for small projects.

It looks like they store the entire JS heap in a SQLite database. We store schematized state checkpoints in a Postgres-compatible database, which makes it so that we can scale up and allow interesting things like querying previous states and time-travel debugging, where you can actually step through previously run workflows.


I've been using Temporal recently for some long-running multi-step AI workflows -- helps me get around API flakiness, manage rate limits for hosted models, and manage load on local models. It's pretty cool to write workers in different languages and run them on different infra and have them all orchestrate together nicely. How does DBOS compare -- what are the core differences?

From what I can tell, the programming model seems to be pretty similar but DBOS doesn't require a centralized workflow server, just serverless functions?


Co-founder here:

Great question! Yeah, the biggest difference is that DBOS doesn't require a centralized workflow server, but does all orchestration directly in your functions (through the decorators), storing your program's execution state in Postgres. Implications are:

1. Performance. A state transition in DBOS requires only a database write (~1 ms), whereas in Temporal it requires a roundtrip and dispatch from the workflow servers (tens of ms -- https://community.temporal.io/t/low-latency-stateless-workfl...).

2. Simplicity. All you need to run DBOS is Postgres. You can run locally or serverlessly deploy your app to our hosted cloud offering.


It might be interesting to look at some standard for workflows, like CACAO, to express what a workflow is. That way, workflows can ultimately become shareable between such workflow execution engines and have common workflow editors. It's (in cyber) a big problem that workflows cannot be shared between different systems, which adds great cost to implementing such a system (you need to redesign or design all workflows from the ground up). I think workflows and easy editors to assemble and connect steps are a good step forward in any automation domain, but everywhere people want to reinvent the wheel of expressing what a workflow is.

Definitely a fan of what these types of systems can do in replaying/recovering and retrying steps etc., as well as centralizing a lot of different workloads onto a common execution engine.


Hello! I’m here to answer any questions. I’d love to hear your feedback, comments, questions, anything!


Looks like an interesting abstraction. I can see the usefulness because I had to create a poor man's version of this when I built a CD pipeline. Sorry, I don't have time to watch the 50-minute video, but how are you guaranteeing durability? Are you basically opening Postgres transactions for each step, or is there something else going on to persist state?


Yeah! Under the hood, DBOS wraps each function (step) to log its output in the database. This ensures that workflows can be safely re-executed if they're interrupted, guaranteeing durability.

More info here: https://docs.dbos.dev/explanations/how-workflows-work
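A stripped-down illustration of that checkpoint-and-replay idea (this is not DBOS's actual implementation; the table name and DB-API cursor usage are made up to show the principle):

    import json

    def run_step_durably(cur, workflow_id: str, step_id: int, fn, *args):
        """Run fn at most once; on re-execution, replay its recorded output."""
        cur.execute(
            "SELECT output FROM step_outputs WHERE workflow_id = %s AND step_id = %s",
            (workflow_id, step_id),
        )
        row = cur.fetchone()
        if row is not None:
            return json.loads(row[0])  # step already completed: replay its output
        result = fn(*args)             # first execution: actually run the step
        cur.execute(
            "INSERT INTO step_outputs (workflow_id, step_id, output) VALUES (%s, %s, %s)",
            (workflow_id, step_id, json.dumps(result)),
        )
        return result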


Can you give us some more details about the CD pipeline you built? :)


Is it possible not to use a PostgreSQL database? For example, would it run with SQLite? The goal is to improve developer experience.


You can now run Wasm builds of PostgreSQL that will get you everything you like about SQLite.

https://github.com/electric-sql/pglite


Co-founder here! No current plans to support SQLite. We picked Postgres because of its huge ecosystem--you can use DBOS with any PostgreSQL-compatible database (Supabase, Neon, Aurora, Cockroach...) and with any Postgres extension (here's an example app using pgvector: https://github.com/dbos-inc/dbos-demo-apps/tree/main/python/...)


(After a quick look at the code) Is this due to concurrency (writes)? It looks like this architecture supports multiple executors, and I would imagine you require transactional guards to ensure consistency. I really like this interface btw, the complexity is hidden very well and from reading your docs it remains accessible if you need to dig deeper than a decorator.

And how the heck are you maintaining TypeScript and Python copies? lol


Thanks for your kind words! We're focusing on Postgres because an important scenario for durable execution is serverless computing, which won't work with an embedded database.


I am sure you are aware of this, but if not: there are some emerging technologies around embedded database scale-out using CRDTs and other replication protocols that would support various “serverless” (as in decentralized) topologies. PGlite, cr-sqlite, libSQL, et al. are ones I am informally looking at for serverless executor agents that do not need to coalesce around a central database instance (“server-full”). I am sure you tested something like this; I would guess that classic/CDC replication lag would throw a big wrench into an attempt to orchestrate disconnected remote executors, but I am hoping that in a peer-to-peer topology this new tech will have low enough sync latencies to be useful. Best of luck with DBOS! You have an amazing team.


How does the DX compare against AWS step functions? My experience is that it is very difficult to “unit test” step workflows.


Step Functions are an "external orchestrator". With DBOS your orchestration runs in the same process, so you can use normal testing frameworks like pytest for unit tests. It's super easy to test locally.
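For example, a workflow can be exercised with an ordinary pytest test (a sketch; checkout is a hypothetical @DBOS.workflow()-decorated function, and a local Postgres instance is assumed):

    from my_app import checkout  # hypothetical workflow under test

    def test_checkout_completes():
        # Workflows are regular Python functions, so pytest can invoke them
        # directly against a local Postgres database -- no external
        # orchestrator to mock or stand up.
        checkout("order-123")  # should run to completion without raising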


We have a time travel debugger that makes it super easy to test workflows. You could set them up in test and then time travel them, or even time travel completed workflows in production.

https://docs.dbos.dev/cloud-tutorials/timetravel-debugging


This looks quite cool. If anyone from DBOS is still around: does this handle more complex dependency relations between workflow steps (e.g. directed graphs), or is it only suitable for linear workflows?


I'd recommend using child workflows for directed graphs. The building blocks are start_workflow to split off a child, and then you can wait for the result at a later time. Workflows can also send events / messages to communicate back and forth with each other.

One neat thing about starting a child workflow is you can assign an idempotency ID, which might be intentionally calculated in a way such that multiple parents will only start one run of the child workflow.
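A sketch of that fan-out pattern (DBOS.start_workflow and SetWorkflowID follow the docs as I read them; the item-processing logic and ID scheme are illustrative):

    from dbos import DBOS, SetWorkflowID

    @DBOS.workflow()
    def process_item(item_id: str) -> str:
        ...  # child workflow body
        return item_id

    @DBOS.workflow()
    def fan_out(item_ids: list[str]) -> list[str]:
        handles = []
        for item_id in item_ids:
            # A deterministic workflow ID means that if two parents try to
            # start the same child, only one run of it is created.
            with SetWorkflowID(f"process-{item_id}"):
                handles.append(DBOS.start_workflow(process_item, item_id))
        # Children run concurrently; wait for all of them to finish.
        return [handle.get_result() for handle in handles]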


Very funny to me how much everyone went away from the db to achieve idempotent behavior, and now we're back to just using a db as a complicated queue with state.


It turns out that DBs were invented to solve hard problems with state management. When people moved away from DBs (at least transactional relational DBs) they had to rediscover all the old problems. Tech is cyclical.


One of the motivations for DBOS is that OSes were designed with orders of magnitude less state to manage than today (e.g., Linux, >30 years ago). What's made to manage tons of state? A DBMS! :)


Recovering the application from failures (especially when updating multiple data sources), once-and-only-once execution, and similar concerns live in the application domain. They have never been handled by relational databases. That is the problem solved by the Python SDK of DBOS (and the TypeScript SDK).


It's sort of a combination of both. The library solves those problems by storing specific data in the database and then taking advantage of the database's ACID properties and transactions to make the guarantees.

Then the DBOS cloud platform optimizes those interactions between the database and code so that you get a superior experience to running locally.


Would I be able to use all the Python and npm packages with it? Would something opening a headless browser to scrape data work with DBOS?


Yes, it's normal Python/Node.js, so you can use all their packages.

We know of users running Puppeteer to scrape data.



