Building a Scalable Accounting Ledger

aryehof · 2024-03-02T08:44:26 1709369066

This claims 1 and -1 is confusing, then proceeds to use the confusion. Why not use the language of the problem domain?

The rules are unambiguous. An increase in assets are a DR, increase in liabilities a CR, in income a CR, in expenses a DR, in owners equity a CR.

Decreases the opposite.

Using 1 and -1 is nonsensical because their meaning in mathematics is fixed, while as shown by the rules above, in accounting DR/CR meanings change.

So a couple of example journal entries…

  Product sale for cash
  Sales CR 100
  Bank  DR 100

  Payment of rent
  Rent DR 50
  Bank CR 50

  Sale for cash where tax collected and owed
  Income      CR 70
  Tax Payable CR 30
  Bank        DR 100

  Monthly petty cash tin entry
  Postage             DR 5
  Accounts receivable DR 20 (loan to John)
  Transport           DR 15 (taxi Paul)
  Bank                CR 40

Note that your bank statement is from the banks perspective. When you put money into the bank it is a CR because it is increase in the banks liability.

Is this too hard to model? I suggest not?

Ferret7446 · 2024-03-02T09:47:05 1709372825

I think you think it's confusing only because you're not used to it. The two models are mathematically equivalent.

The rules are unambiguous. Values which indicate capital are positive. Values which indicate ownership are negative. An increase in capital corresponds to an increase in ownership of said capital, which add up to zero. Perfectly balanced, as bookkeeping should be.

I find CR DR to be more confusing than +/-, because the latter works out naturally with basic arithmetic, whereas the former requires this arbitrary DR CR crud, which are made up terms that literally mean nothing and you have to rote memorize how they apply to different accounts (and to add insult to injury, the terms often have the exact opposite meaning to consumers due to how banks present the terminology, which means the terms literally can mean either thing depending on the context).

foota · 2024-03-02T10:53:46 1709376826

I think where this would get confusing is that a positive number would have to decrease some accounts, but increase others, right?

jeltz · 2024-03-02T12:53:26 1709384006

It does not because if you use + and - liabilities are negative and assets positive and if you add all balances the sum is zero. Actually the commercial accounting software I use works like that, it uses negative balances for liability accounts.

naasking · 2024-03-02T13:52:12 1709387532

Exactly, it's completely unnecessary extra information that doesn't need to appear in the core data model that the user never directly sees. Extracting credits, debits, reversals, returns, etc. is all higher-level information.

sam_lowry_ · 2024-03-02T09:34:30 1709372070

Also, money is usually kept as integer of cents. It is clear, inambiguous and allows for other rounding rules that sometimes pop up in legislation.

olivermuty · 2024-03-02T11:51:27 1709380287

Money exists outside us dollar and should be stored ideally a a composite of currency and decimal or just a decimal.

Then use currency specific rounding rules and always defer to CLDR [0] when displaying.

1. https://cldr.unicode.org/translation/number-currency-formats...

jojobas · 2024-03-02T12:14:31 1709381671

There is always a main currency for any entity, so any foreign currency dealings or having a reporting currency adds some fun in the form of changing exchange rates, realized/unrealized exchange rate differences, rounding errors and other shenanigans.

mrkeen · 2024-03-02T12:25:16 1709382316

> Is this too hard to model? I suggest not?

I'm sure I could put my mind to following those rules, but what's to be gained?

If 'assets' and 'income' are opposites, and 'income' and 'liabilities' are the same, then I'm just not going to bother. What if my income is paid in assets in the form of $1 bills? CR or DR?

And 'equity', Jesus is that a nebulous concept.

FredPret · 2024-03-02T16:40:10 1709397610

Equity is the most important of all - it’s simply what the business is worth to the owners if you sold everything

jojobas · 2024-03-02T12:23:36 1709382216

Then you get to corrective transactions, where a normal return is

  Sales DR 100
  Bank CR 100

while an annulled sales invoice is

  Sales CR -100
  Bank DR -100

The point of using debit/credit vs +/- is just that, the ability to have correct account turnovers with corrections, and maybe some conventional understanding of what's an expense account in debit and what's a vendor account in credit.

naasking · 2024-03-02T13:32:37 1709386357

A return and an annulled sale are still credits and debits, respectively, when it comes to the mechanical sum being performed. The specific category of the action is higher-level information that doesn't need to appear at this level of the model.

memset · 2024-03-02T14:00:53 1709388053

In my scheme, 1 always means DR and -1 always means CR. I use this, rather than the strings DR/CR, so the database can more easily do arithmetic to calculate balances. In order to do that arithmetic, we multiply the account's normal (again, 1 or -1) by the direction (1, -1) to determine if the amount represents an increase or decrease.

memset · 2024-03-02T14:13:47 1709388827

Did not expect this to hit the front page! A few notes:

- 1 always represents debit and -1 always represents credit. Regardless of whether it is an asset or liability account.

- Each account has what is called a "normal balance". This tells us whether you increase the value of the account using debits or credits. This is also modeled as 1 or -1.

- Amounts are always positive

- To find the actual dollar value for a transaction, you can multiply: amount * direction * normal. If the direction and normal are the same sign, then that means we increase the value. Otherwise we decrease. Your code doesn't need to know which accounts are assets vs liability vs equity in this scheme.

- Of course, you still need to know whether to insert transactions as debit or credit, and you need to know the normal balance of your accounts. But once you figure that out (by googling, or talking to your finance team) then the data model takes care of the rest.

hodgesrm · 2024-03-03T20:59:03 1709499543

Thank you for the post. One question: do you think that ClickHouse is a good database for solving this problem? ClickHouse does not guarantee ACID properties like isolation or consistency at levels above input blocks. It also uses eventually consistent replication between table replicas. So it's quite possible the same query may return different results depending on which server it runs.

I work on ClickHouse and am a big fan. That said, this consistency issue has arisen from time to time in customer cases when dealing with financial data. I'm curious if you have seen it in this case.

dcuthbertson · 2024-03-02T14:00:34 1709388034

Tangentially related, I know little about double-entry accounting and was wondering what CR and DR meant. If CR was credit, then what was DR? It's somewhat obviously debit, but why DR and not DB or some other abbreviation that uses letters in "debit"? I found a beautiful answer here [0] in the Accounting Historians Journal.

[0]: https://egrove.olemiss.edu/aah_journal/vol13/iss2/12/

Edit: I should add that CR and DR at one point in time were related to creditor and debitor. I referenced that article, because it's the journey of the author to find an answer that I enjoyed about it.

hum3hum3 · 2024-03-02T09:46:28 1709372788

I think the transactions table is misnamed. This is a table of posting legs. Which all must balance for a single transaction. I don't think you should get hung up on +ve-ve versus cr and Dr. When double entry was formalised negative numbers were not common and then thought errant. It is only in the eighteenth century it became normal. I enjoyed the entry. Note that you might you use different minimum fractions eg we use 0.00001p as a discrete value.

memset · 2024-03-02T14:05:47 1709388347

Legs might be better name. Line items?

I think a lot of people are glossing over the fact that you not only need 1/-1 for DR/CR, but also the "normal" balance of the account to tabulate balances.

sverhagen · 2024-03-02T11:56:26 1709380586

How many legs are there in a transaction?

lelanthran · 2024-03-02T19:35:59 1709408159

> How many legs are there in a transaction?

Almost always multiple, depending on what you consider a transaction, and the PoV of the accounts involved, etc.

I cannot think of a transaction that has a single leg only.

Here's a possibility: You buy something online. You pay $100 by credit card ($80 cost plus $20 shipping).

The accounting system needs to record your purchase. Maybe dr 'Mr ClientName' with $100, cr 'Clearing' with $100.

At some point bank recon will happen, the payment will reflect and the amount will be considered 'cleared'. At this point dr 'Clearing' with $100, cr 'depot' with $100 and delivery details.

The depot will cr 'Stock' for $80 worth of stock, dr 'Courier' for $20[1] using the order number and the companys client number at the courier for folio. Payment request fired off to people with authority to approve payments (PWATAP).

PWATAP will place order at supplier for $80 of stock (to replenish), dr 'PurchaseInProgress' with $80, pay courier service and cr 'Courier' with $20.

In reality, all of those things either happen in a more streamlined way (a holding account is used and replenishments are purchased periodically in batches, not individually), or behind the scenes and invisible to the end-user (dispatcher simply puts "fulfilled order" into the system, and the system will split it up however it needs to).

[1] The courier will go through the same process, for their services rendered

hum3hum3 · 2024-03-02T14:00:31 1709388031

1 for a zero amount more realistically commonly 2. End of year may have one for each account do many.

iKororo · 2024-03-02T11:52:46 1709380366

Feels more as a way to introduce a product (scratch data), rather than a technical article. If you have only one entity to manage accounts for - why would you need to build an accounting engine? In other case, when you manage accounts for multiple entities, such as different "clients" or "branches" - you have millions of transactions, and a query such as "get accounts for this client" will take a long time - and you still need to introduce some kind of materialized views.

memset · 2024-03-02T14:04:01 1709388241

Two perspectives on this:

- I believe this is a good application of columnar stores (ie, Clickhouse) rather than traditional Postgres, where handling 1M transactions is really fast.

- You might still choose to use materialized views! This article is a suggestion for a table structure that powers those views.

BirderO44 · 2024-03-02T11:18:38 1709378318

This confirms what I’ve heard repeated before: it’s easier to teach programming to an accountant than accounting to a programmer.

krackers · 2024-03-02T20:49:04 1709412544

Accounting is actually not hard, what's hard is the fact that accounting as a field didn't adapt to the existence of negative numbers so they still talk about the effects of "credits" and "debits" based on underlying account type.

Physics may have gotten the direction of current flow wrong, but at least they were _consistent_ and didn't start talking about "source voltage" and "load voltage" which would have muddled KVL into `sum{delta_sources} = sum{delta_loads}` compared to `sum{delta_voltage} = 0`.

The real issue is that besides the famous Kleppmann article, it's hard to find material on accounting that sheds this legacy baggage.

UnserMannInK · 2024-03-03T13:35:13 1709472913

From my experience as a programmer who did accounting for a while: Accounting is the easy part, but the borders to taxes are nonexistent (at least here in Germany). So when you do accounting you always do taxes to a certain degree. And this is not something that can be understood with logic, it’s completely arbitrary.

On software that was written by accountants: look no further than DATEV. It’s an abomination I would not inflict on my worst enemy.

naasking · 2024-03-02T13:17:47 1709385467

Plenty of articles of programmers talking about accounting to form one side of that opinion, where are the articles from accountants talking about programming to inform the other side?

throwaway4good · 2024-03-02T14:38:07 1709390287

How is this any difference with using the sign of the amount column to signify debit or credit? Using that approach at least you save a column. Otherwise have explicit debit and credit columns if you want to be "consistent with how accounting is done".

greatgib · 2024-03-02T11:26:08 1709378768

I don't really see how what is described is "scalable".

With hundred thousandth of transactions, performing the sums will be expensive.

And nothing ensure the safeness of the order of entries...

sverhagen · 2024-03-02T11:51:56 1709380316

I think they were suggesting they chose "a column-oriented database like Clickhouse" to solve the summing problem for them, with some magic under the hood, maybe doing aggregates, as was mentioned separately as a solution if you wanted to roll your own.

azlev · 2024-03-02T10:57:21 1709377041

I didn't get the date as text and the amount as real in table transactions.

FrancoisBosun · 2024-03-02T12:58:53 1709384333

SQLite, as used in the article, has a very simple data model: 64 bit ints, 64 bit floats, and textual data. If you used a different database engine, you would probably use the datatypes it offers instead.

memset · 2024-03-02T14:06:56 1709388416

Agree with this! I just used SQLite as kind of a pseudocode for demonstrating the concept.

azlev · 2024-03-02T15:01:52 1709391712

Ah, thank you. I understood that it was a Clickhouae schema, it's not.

ponywombat · 2024-03-02T11:34:13 1709379253

Why would you want to build your own accounting ledger from scratch? Accounting is a completely new domain for most engineers, and TigerBeetle (https://tigerbeetle.com/) already solves this problem.

olivermuty · 2024-03-02T11:54:29 1709380469

I think our definition of «solved» might be slightly different hehe.

«The production release of TigerBeetle is imminent.»

Not putting money into this system any time soon in other words.