This claims 1 and -1 is confusing, then proceeds to use the confusion. Why not use the language of the problem domain?
The rules are unambiguous. An increase in assets are a DR, increase in liabilities a CR, in income a CR, in expenses a DR, in owners equity a CR.
Decreases the opposite.
Using 1 and -1 is nonsensical because their meaning in mathematics is fixed, while as shown by the rules above, in accounting DR/CR meanings change.
So a couple of example journal entries…
Product sale for cash
Sales CR 100
Bank DR 100
Payment of rent
Rent DR 50
Bank CR 50
Sale for cash where tax collected and owed
Income CR 70
Tax Payable CR 30
Bank DR 100
Monthly petty cash tin entry
Postage DR 5
Accounts receivable DR 20 (loan to John)
Transport DR 15 (taxi Paul)
Bank CR 40
Note that your bank statement is from the banks perspective. When you put money into the bank it is a CR because it is increase in the banks liability.
I think you think it's confusing only because you're not used to it. The two models are mathematically equivalent.
The rules are unambiguous. Values which indicate capital are positive. Values which indicate ownership are negative. An increase in capital corresponds to an increase in ownership of said capital, which add up to zero. Perfectly balanced, as bookkeeping should be.
I find CR DR to be more confusing than +/-, because the latter works out naturally with basic arithmetic, whereas the former requires this arbitrary DR CR crud, which are made up terms that literally mean nothing and you have to rote memorize how they apply to different accounts (and to add insult to injury, the terms often have the exact opposite meaning to consumers due to how banks present the terminology, which means the terms literally can mean either thing depending on the context).
It does not because if you use + and - liabilities are negative and assets positive and if you add all balances the sum is zero. Actually the commercial accounting software I use works like that, it uses negative balances for liability accounts.
Exactly, it's completely unnecessary extra information that doesn't need to appear in the core data model that the user never directly sees. Extracting credits, debits, reversals, returns, etc. is all higher-level information.
There is always a main currency for any entity, so any foreign currency dealings or having a reporting currency adds some fun in the form of changing exchange rates, realized/unrealized exchange rate differences, rounding errors and other shenanigans.
I'm sure I could put my mind to following those rules, but what's to be gained?
If 'assets' and 'income' are opposites, and 'income' and 'liabilities' are the same, then I'm just not going to bother. What if my income is paid in assets in the form of $1 bills? CR or DR?
Then you get to corrective transactions, where a normal return is
Sales DR 100
Bank CR 100
while an annulled sales invoice is
Sales CR -100
Bank DR -100
The point of using debit/credit vs +/- is just that, the ability to have correct account turnovers with corrections, and maybe some conventional understanding of what's an expense account in debit and what's a vendor account in credit.
A return and an annulled sale are still credits and debits, respectively, when it comes to the mechanical sum being performed. The specific category of the action is higher-level information that doesn't need to appear at this level of the model.
In my scheme, 1 always means DR and -1 always means CR. I use this, rather than the strings DR/CR, so the database can more easily do arithmetic to calculate balances. In order to do that arithmetic, we multiply the account's normal (again, 1 or -1) by the direction (1, -1) to determine if the amount represents an increase or decrease.
Did not expect this to hit the front page! A few notes:
- 1 always represents debit and -1 always represents credit. Regardless of whether it is an asset or liability account.
- Each account has what is called a "normal balance". This tells us whether you increase the value of the account using debits or credits. This is also modeled as 1 or -1.
- Amounts are always positive
- To find the actual dollar value for a transaction, you can multiply: amount * direction * normal. If the direction and normal are the same sign, then that means we increase the value. Otherwise we decrease. Your code doesn't need to know which accounts are assets vs liability vs equity in this scheme.
- Of course, you still need to know whether to insert transactions as debit or credit, and you need to know the normal balance of your accounts. But once you figure that out (by googling, or talking to your finance team) then the data model takes care of the rest.
Thank you for the post. One question: do you think that ClickHouse is a good database for solving this problem? ClickHouse does not guarantee ACID properties like isolation or consistency at levels above input blocks. It also uses eventually consistent replication between table replicas. So it's quite possible the same query may return different results depending on which server it runs.
I work on ClickHouse and am a big fan. That said, this consistency issue has arisen from time to time in customer cases when dealing with financial data. I'm curious if you have seen it in this case.
Tangentially related, I know little about double-entry accounting and was wondering what CR and DR meant. If CR was credit, then what was DR? It's somewhat obviously debit, but why DR and not DB or some other abbreviation that uses letters in "debit"? I found a beautiful answer here [0] in the Accounting Historians Journal.
Edit: I should add that CR and DR at one point in time were related to creditor and debitor. I referenced that article, because it's the journey of the author to find an answer that I enjoyed about it.
I think the transactions table is misnamed. This is a table of posting legs. Which all must balance for a single transaction.
I don't think you should get hung up on +ve-ve versus cr and Dr. When double entry was formalised negative numbers were not common and then thought errant. It is only in the eighteenth century it became normal.
I enjoyed the entry. Note that you might you use different minimum fractions eg we use 0.00001p as a discrete value.
I think a lot of people are glossing over the fact that you not only need 1/-1 for DR/CR, but also the "normal" balance of the account to tabulate balances.
Almost always multiple, depending on what you consider a transaction, and the PoV of the accounts involved, etc.
I cannot think of a transaction that has a single leg only.
Here's a possibility: You buy something online. You pay $100 by credit card ($80 cost plus $20 shipping).
The accounting system needs to record your purchase. Maybe dr 'Mr ClientName' with $100, cr 'Clearing' with $100.
At some point bank recon will happen, the payment will reflect and the amount will be considered 'cleared'. At this point dr 'Clearing' with $100, cr 'depot' with $100 and delivery details.
The depot will cr 'Stock' for $80 worth of stock, dr 'Courier' for $20[1] using the order number and the companys client number at the courier for folio. Payment request fired off to people with authority to approve payments (PWATAP).
PWATAP will place order at supplier for $80 of stock (to replenish), dr 'PurchaseInProgress' with $80, pay courier service and cr 'Courier' with $20.
In reality, all of those things either happen in a more streamlined way (a holding account is used and replenishments are purchased periodically in batches, not individually), or behind the scenes and invisible to the end-user (dispatcher simply puts "fulfilled order" into the system, and the system will split it up however it needs to).
[1] The courier will go through the same process, for their services rendered
Feels more as a way to introduce a product (scratch data), rather than a technical article. If you have only one entity to manage accounts for - why would you need to build an accounting engine? In other case, when you manage accounts for multiple entities, such as different "clients" or "branches" - you have millions of transactions, and a query such as "get accounts for this client" will take a long time - and you still need to introduce some kind of materialized views.
- I believe this is a good application of columnar stores (ie, Clickhouse) rather than traditional Postgres, where handling 1M transactions is really fast.
- You might still choose to use materialized views! This article is a suggestion for a table structure that powers those views.
Accounting is actually not hard, what's hard is the fact that accounting as a field didn't adapt to the existence of negative numbers so they still talk about the effects of "credits" and "debits" based on underlying account type.
Physics may have gotten the direction of current flow wrong, but at least they were _consistent_ and didn't start talking about "source voltage" and "load voltage" which would have muddled KVL into `sum{delta_sources} = sum{delta_loads}` compared to `sum{delta_voltage} = 0`.
The real issue is that besides the famous Kleppmann article, it's hard to find material on accounting that sheds this legacy baggage.
From my experience as a programmer who did accounting for a while: Accounting is the easy part, but the borders to taxes are nonexistent (at least here in Germany). So when you do accounting you always do taxes to a certain degree. And this is not something that can be understood with logic, it’s completely arbitrary.
On software that was written by accountants: look no further than DATEV. It’s an abomination I would not inflict on my worst enemy.
Plenty of articles of programmers talking about accounting to form one side of that opinion, where are the articles from accountants talking about programming to inform the other side?
How is this any difference with using the sign of the amount column to signify debit or credit? Using that approach at least you save a column. Otherwise have explicit debit and credit columns if you want to be "consistent with how accounting is done".
I think they were suggesting they chose "a column-oriented database like Clickhouse" to solve the summing problem for them, with some magic under the hood, maybe doing aggregates, as was mentioned separately as a solution if you wanted to roll your own.
SQLite, as used in the article, has a very simple data model: 64 bit ints, 64 bit floats, and textual data. If you used a different database engine, you would probably use the datatypes it offers instead.
Why would you want to build your own accounting ledger from scratch? Accounting is a completely new domain for most engineers, and TigerBeetle (https://tigerbeetle.com/) already solves this problem.
The rules are unambiguous. An increase in assets are a DR, increase in liabilities a CR, in income a CR, in expenses a DR, in owners equity a CR.
Decreases the opposite.
Using 1 and -1 is nonsensical because their meaning in mathematics is fixed, while as shown by the rules above, in accounting DR/CR meanings change.
So a couple of example journal entries…
Note that your bank statement is from the banks perspective. When you put money into the bank it is a CR because it is increase in the banks liability.Is this too hard to model? I suggest not?