Hacker News new | past | comments | ask | show | jobs | submit login
Double-Entry Ledgers: The Missing Primitive in Modern Software (pgrs.net)
55 points by pgr0ss 6 days ago | hide | past | favorite | 18 comments





I've lost count of the number of clients/projects where I had to implement double-entry ledgers. The surprising thing is that no two were the same. Almost in every case there were some specifics that were novel. And trying to determine the current balance of an account when all you have is thousands of prior transactions is tedious at best. In the old days, there was a process called closing off which was effectively a checkpoint in time. The most convoluted example I suffered through was a ledger handling multiple currencies with a mixture of spot and forward contract exchange rates.

Obviously there are novel parts, because the transactions that alter balances are different in every application, and each has its complex "business" logic: undoing a reversed sale is completely different from undoing a mass of accumulated incorrect VAT amounts.

The idea is that the ledger implementation itself is generic and shared, and then how you use the ledger is the novel part.

My idea is that a reusable ledger, as a service or library, would be too distant from actual needs to provide much value.

Consider the example database state in the article:

Transfer ID Description ┃ user receivables available 1 order created ┃ -$10 $10 2 payment received ┃ -$10 $10 3 partial refund ┃ $5 -$5

A real application cannot go very far with this kind of "description": every type of transaction needs specific data, whose complexity dwarfs the double-entry ledger mechanism.

For example "order created" transactions need a reference to an order with details of what has been ordered, "payments" need a reference to a collections of debts that they are meant to pay and details of how they have been paid, "refunds" are actually many types and need a reference to what they are refunding.

Real accounts are also much more complex than a history of changes and a computed balance: for example, some of them need to be included in accounting calculations and reports in various roles, with largely arbitrary relationships and accounting criteria, while some are linked to real-world persons.

Comparing the two ledger implementations that the article references is instructive: Pgledger has simple ID columns for accounts and transfers, and it is up to the application to use them as foreign keys from the interesting tables, while TigerBeetle also offers some generic "user data" and "code" columns with arbitrary data and suggests having a separate "control plane" database.


My idea for for this to add metadata to the transfers, likely with a jsonb column. But I agree that the ledger won't encompass everything. You would still have other database tables and then you'd store domain identifiers in the ledger and/or ledger identifiers in your other tables.

Maybe I'm not understanding it, but don't you keep a record of the current balance all the time? And only need to go through the transactions in order to verify the ledger invariants, which I guess would only happen fairly rarely?

I guess, but often, or often enough, you don't just need the balance now, but also then.

Makes sense. It seems like one would need a wrapper like "getBalance(accountName, dateTime)" which would abstract iterating over the ledger to find the balance at a certain time.

But I could see this being a performance hotspot - looping over rows and tallying a running sum across many accounts seems like a waste of cycles.

What's the alternative to the "what was the balance at date/time" in non-ledger based systems?


shouldn't the balance be calculated and and compared with the actual balance after every transaction? i have an account where money went missing. the only way to find out when that happened was to have a record of the actual balance after every transaction.

Problem is solved with 4 timestamps.

Great article. I love ledgers too. While I've only used one for personal accounting, this makes it clear there are a lot of other practical use cases.

Other commentor brings up a good point though - how do we efficiently get the balance of a certain account at a certain date time? Does using a ledger require constantly summing up columns? What happens when we have 1M, 10M, or 100M+ records?

Maybe the ledger would need to be broken into separate "completed" or "sealed" groups once we verify the balance is zero, so like the other commenter mentioned, we could have "checkpoints" where the balance is known to be zero and can calculate from that point forward, instead of from record 0.


In manual system, we used to have a "page total" for a single page of a ledger/book so you just had to query last page total and bring that forward. In immutable ledgers, if you have a running total, the last balance is your current one.....

And you are certain about going 1 month (or period) going back that at a particular date, at he last entry the balance was something.

Same as in paper books. Just see page total of that particular date....


> how do we efficiently get the balance of a certain account at a certain date time?

This depends on how you implement the ledger. For pgledger, I store the balance in every entry, so to find a historical balance, you just need to find the most recent entry before that time:

https://github.com/pgr0ss/pgledger/blob/df5541dcf25f416a6a24...

I have a mental TODO to add a query function to make this simpler.


I updated the README and added an example test: https://github.com/pgr0ss/pgledger?tab=readme-ov-file#histor...

That makes sense and is an elegant solution. This way one can check the balance at any time without recalculating, but we still have an audit log to find discrepancies when the balance is expected to be zero but is not.

Depends on the ERP. Some store a running balance in the account ledger; others sum and cache the results until the data changes; and the some actually sum everything every time because accuracy matters more than speed.

Generally, the most practical way to do is to "finalize" a number periodically (monthly and annually) when the books for the prior period have been "closed" (locked from further edits) and then sum only transactions taking place since the last closed balance. Most major ERPs do this.


If SQLite supported WASM UDFs...

So essentially a double entry ledger is a state machine with the movement between every state being recorded.

Is this a good abstraction?




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: