
I'd like to see someone have a decent go at reexamining the idea that business logic (BL) doesn't belong in the DB and tease apart exactly what that should mean, whether it's still true, or whether it was ever true in the first place. Maybe something like that's been posted here before, but I haven't seen it...

I bought into the idea myself for a while, but when I interrogate my belief, it turns out it was just something I picked up at uni as part of the general 3-tier approach, and I'm not sure how much of it is really practically grounded. I see the same ideas held dogmatically by newer employees, and when I ask them about it in detail, I see the same fuzziness.

After all, if you really want to be pedantic, you could claim that anything beyond a single table with two columns, "key" and "value", is pushing BL into the DB.
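Taken to that extreme, the schema looks something like this (a throwaway sqlite3 sketch; the keys and payloads are invented): the DB will accept literally anything, so every invariant has to be re-enforced by each code path that touches it.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")

# The maximally "logic-free" schema: one table, two columns.
con.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")

# The DB knows nothing about the payloads, so it can't object to any of these:
con.execute("INSERT INTO kv VALUES (?, ?)",
            ("user:1", json.dumps({"email": "a@example.com"})))
con.execute("INSERT INTO kv VALUES (?, ?)",
            ("user:2", json.dumps({"email": "a@example.com"})))  # duplicate email
con.execute("INSERT INTO kv VALUES (?, ?)", ("user:3", "not even JSON"))

# Uniqueness, required fields, even well-formedness of the value are now
# purely the application's problem.
```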

I.e. what does the separation practically gain you? You may want to swap out the DB in future? That almost never happens, and not without some significant other refactoring going on (e.g. breaking out into separate DBs because the product's grown to the point of needing to be distributed across services, etc). If you really want to design with that in mind, you're almost certainly giving up a lot of functionality that your particular DB gives you. It's on the level of holding onto the possibility that you'll want to change the language you're writing in at some stage (and that event would favour pushing more into the DB anyway).

In reality, what I've found is that when there's a reliable, transactional, relational datastore at the bottom (i.e. we're not talking about a larger distributed system), then for your own sanity you want to be pushing as much as possible down the stack, to reduce the surface area through which someone could change data in the wrong way. Data consistency issues are some of the worst when it comes to eroding customer trust, and if your data access/mutation paths are many and varied, you can do your head in or burn through a lot of dev morale trying to diagnose them.
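To make that concrete, here's a minimal sqlite3 sketch (tables and names invented) of what pushing rules down the stack buys you: the constraints fire no matter which code path does the write, whether it's the app, an admin script, or an ad-hoc console session.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs per-connection
con.executescript("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
    );
    CREATE TABLE transfer (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES account(id),
        amount_cents INTEGER NOT NULL
    );
""")
con.execute("INSERT INTO account (id, balance_cents) VALUES (1, 1000)")

# A write that violates the rules simply fails, regardless of who issues it:
try:
    con.execute("UPDATE account SET balance_cents = -50 WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # CHECK constraint violation

try:
    con.execute("INSERT INTO transfer (account_id, amount_cents) VALUES (999, 100)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # FOREIGN KEY constraint violation
```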

The strongest advocates otherwise I've found are in office cultures where there's little trust between devs and DBAs, and getting things through DB review is a rigmarole that ends up driving devs towards doing as much in code as possible. I've always suspected that beyond Google envy, these sorts of dynamics are what drove a lot of the initial NoSQL movement...




Thanks, the Derek Sivers piece rings a bell now (and maybe was going into my thinking on this), and the other two will go into my notes to avoid having to repeat myself the next time this one inevitably comes up on the office Slack...


I worked on one system that did intentionally put business logic almost entirely in the DB. The VP was a former database guy and the primary justification was auditing. It worked well enough in practice.

There’s a quote from Gavin King somewhere exhorting developers that ORMs and raw SQL are meant to be used concurrently: the ORM for CRUD ops, SQL for most everything else.

The problem I have with ORMs is lack of paging on collections; there’s always that one user that pulls in half the database.
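The usual escape hatch is to bypass the ORM's collection loading for those cases and page by keyset in raw SQL, so no single user can drag in the whole table. A minimal sqlite3 sketch (table and sizes invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)")
con.executemany("INSERT INTO orders (id, customer_id) VALUES (?, 1)",
                [(i,) for i in range(1, 10_001)])  # one customer, 10k orders

def orders_page(customer_id, after_id=0, page_size=100):
    """Keyset pagination: fetch one bounded page instead of the whole
    collection, no matter how many rows the customer has."""
    rows = con.execute(
        "SELECT id FROM orders WHERE customer_id = ? AND id > ? "
        "ORDER BY id LIMIT ?",
        (customer_id, after_id, page_size),
    ).fetchall()
    return [r[0] for r in rows]

first = orders_page(1)                        # ids 1..100
second = orders_page(1, after_id=first[-1])   # ids 101..200
```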


In my experience, keeping BL out of the DB is a practical concern, not a dogmatic one.

Usually there are two things:

1. Performance. Business logic is usually horizontally scalable; databases usually are not. You want to pull as much BL out of the database as possible and put it into stateless microservices so they can be scaled.

2. Version control and deploying changes to business logic. Business logic changes frequently. Do you want to have to be making that frequent of changes to your database? Do you have practices around safe deployment and code review for stored procedures?

I've worked at a place where there was lots of business logic (millions of LOC) stored "in the database". In this case the database was MUMPS. It can be done, and it can be pleasant. The catch is that vertical scaling is the only option, and you have to spend years building your own tooling from scratch.

I think the line between database and code is destined to become even more blurred, not by bringing the code down into the database, but by lifting more and more database concepts up into "application space" as distributed computing becomes more normalized.


Yeah the performance point is interesting, and I think it has merit in a well-defined scope. The idea of bringing DB concepts up is something I've seen done, but it wasn't a pleasant experience overall and I think everyone involved is regretful that we went down that path. It's a good story though. Ramble starts:

The system was an enterprise-focused offering with weak product management, so in the early growth-oriented days the company ended up saying "yes" to a lot of feature requests that we probably shouldn't have. Where this particularly hurt was the permissions system, which evolved over time from an ACL with odd warts to also include a bolted-on hierarchy of users and their objects, then a role-based system, and then all sorts of enterprise-level oddities: cross-links in the hierarchy, shared subhierarchies, roles that could appear at multiple hierarchy points, virtual ACLs that behaved slightly differently, etc. So: a potentially large number of different access paths between a given user and resource.

Years ago, when this mess was starting to expand, it was decided it was too hard to model this in the DB (and really that should've been a giant red flag), so the new approach was to load each customer's dataset up into an application server that would handle the BL, crucially including the permission logic. It very much wasn't stateless, as you mention, though I'm not sure how it could've been, given how much you needed loaded up from the DB to make those permission decisions. It would've avoided a lot of headache if it had been...

The consequence, though, of using this write-through in-memory cache was that it became the source of truth.

The chief problem this led to was that the shift into application land was a bell that couldn't be unrung. Everything others in this thread have complained about seeing (BL spread across all sorts of SPs in a DB) happened here, just in application code (which again hints, I guess, that the chief problem is architectural and a lack of governance, not a wrong-layer problem). Nobody could properly describe the permissions system in response to support requests without a lot of code scouring, let alone hold it in their heads.

Even worse, as the application grew in size and needs, we found we still needed things we had left behind in the DB. A couple of smart devs working on the core service, because they were smart and trusted, convinced themselves that what we needed was an in-memory transaction system for managing this in-server cache (by now the core was being called the Object Cache, a name so generic that it too should've been a red flag). So a couple of years went into a technically impressive reimplementation of a transaction system.

Meanwhile the system as a whole was well past the point of needing to be split up into multiple services, so a grand goal was set of moving the Object Cache into a standalone service: the Object Service. Slap an OData API on it, then leave that API open for all internal teams to hit. By this point the core team who owned this was becoming well aware they had fallen into the pit of reimplementing the Postgres DB everything still sat on: transactions, a generic query API, configurable in-memory indexes for each type of object, partition loading logic for the timeseries sets, user permissions, etc. Worse, the generic query API (and this is what ultimately turned me off OData/GraphQL etc for cross-team interfaces) ran into all the same problems as an SQL interface: people in other teams would always be coming up with brand new queries your indexes hadn't anticipated, forcing new work on the keepers of the cache to support them.

The way forward probably would've been to leave the permissions structure in place and pull as much as possible out of the service/cache into separate standalone services, i.e. leave the objects in the Object Cache as little more than IDs you could look up elsewhere for actual attributes. We'll never really know though: it was recently decided to put the whole thing into maintenance mode, partially because nobody has the political will to fix the thing, given its complexity.

I've thought a lot about the lessons of this system and where it went wrong, and I trace it ultimately to forging ahead with a permissions system (i.e. core BL) that couldn't be modelled in the DB. I think that doomed us to what I'd ultimately call second-system syndrome (i.e. the whole stack above slowly evolved into a poor Postgres clone). Perhaps if we had been ruthless in going to a CQRS design or similar very early on (which would again have demanded stateless services above), we could've pulled it off.

I think a big takeaway for me was to not take for granted the good things an RDBMS gives you, especially transactionality. These days I'd bias towards looking for ways to scale up the powers the DB gives you (even if that involves pushing external integrations into the DB, maybe via some funny FDW or transaction log integrations, or maybe using a tiered solution of DBs: one layer as the storage layer and another layer above it acting as coherent transactors, a little bit like Datomic's internal structure), rather than rushing toward replicating core DB functionality in application code, which can be tempting initially as the easier-looking solution.

I think I might blog about this when the system's properly retired and those involved can look back and laugh rather than cringe...


As a rule, relational database systems have two strong areas: parallel processing and data consistency enforcement. I've always understood "don't do business logic in the database" as meaning not to use the database engine for sequential processing.

If you have a single-user database system (e.g. a webapp with a database backend), there is relatively little to gain from implementing data validity checking in the database, other than the declarative statements versus imperative rules discussion (which can already be a huge benefit, depending on the team).

But once you have a central database with multiple frontends (common in enterprise ERP solutions), enforcing data consistency in the backend becomes pretty much unavoidable -- otherwise a single bug or new feature in one frontend could disable entire production lines.
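A small sqlite3 sketch of that idea (the table and invariant are invented): when the rule lives in the backend as a trigger, every frontend, from the web UI to a hand-written fix-up script, is covered by it, and a bug in any one of them can't corrupt the shared data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE stock (part_id INTEGER PRIMARY KEY, qty INTEGER NOT NULL);

    -- Enforced in the backend, so no single frontend has to remember the rule.
    CREATE TRIGGER stock_non_negative
    BEFORE UPDATE ON stock
    WHEN NEW.qty < 0
    BEGIN
        SELECT RAISE(ABORT, 'stock cannot go negative');
    END;
""")
con.execute("INSERT INTO stock VALUES (1, 5)")

try:
    # A buggy frontend tries to take more than is available:
    con.execute("UPDATE stock SET qty = qty - 10 WHERE part_id = 1")
except sqlite3.IntegrityError as e:
    print("blocked by backend:", e)
```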

when there's a reliable, transactional, relational datastore at the bottom [..] you want to be pushing as much as possible down the stack, to reduce the surface area that someone could [..] change data in the wrong way

Yes, this. Very much this.


I'm working on a project that is almost literally your example of reducing a relational database to a key-value store, despite the client applications having complex problem domains to model.

The consequence has been catastrophic data consistency problems and the whole programme rapidly grinding to a halt.

I think the problem is the overly broad definition of 'business logic' that encompasses everything from data integrity to presentation logic.



