Hacker News new | past | comments | ask | show | jobs | submit login
Taming the beast that is the Django ORM – An introduction (davidhang.com)
143 points by AbundantSalmon 4 months ago | hide | past | favorite | 140 comments



I think Django's ORM is just AMAZING. And as with every other tool, it has to be used wisely.

First, it let's you get started quickly and prototype. You can write unit tests, make sure everything is working as expected, then count queries and make sure you're being efficient with your SQL engine.

Second, and even more importantly, it's crucial in the definition of the app and the Schema. Thinking in high level "classes and objects" helps with the abstraction and the design of the apps. Even if you then default to raw SQU queries, thinking and building your model with class abstractions is huge.

Finally, there are some "tiny details" (but in my eyes, very important) that everybody oversees:

* Migrations: the way Django has designed migrations is just marvelous. We've migrated tons of data and changed the structure of our DB multiple times without any issues.

* Troubleshooting and "reporting": the ability to fire a quick shell, load a few models in Django's REPL, and do a few `.filters()` is for me key. On top of that, we add a Jupyter server connected to our read replica and we can do all sorts of reporting VERY QUICKLY. Not everybody needs a Data Lake :)

PS: We've ran Django sites accessed by millions of people per month. And we never had an issue with the ORM. Yes, sometimes I have to tell more junior devs that the gymnastics they're doing with `select_related` and `prefetch_related` can be more easily and effectively resolved with a query. But that's it. I'd say less than 1% of all the queries in our codebase have to be migrated to raw SQL.


One common pitfall with Django ORM's is that it makes it very easy the use of inheritance in models. But as we know, the "impedance mismatch" between OOP and the Relational model is a problem.

It has happened that a dev in our team was populating an endpoint that used inheritance and when looking at the number of queries we were over 100.

But the solution in those cases is just OUTER JOINs and CASE. Especially since we use Postgres in the backend and it works so well.

So yes, there are some pitfalls, but they're greatly overshadowed by the advantages of the ORM.


You have to use the ORM with a bit of awareness of the query it it generates. Django provides plenty of tools and great documentation for reducing, combining and managing queries. It's not like it's a deep dark corner.

It's like understanding memory allocation if you're writing c. It's just part of the job


We've had serious trouble with migration merges when two people work on different parts of the code, yet in the same module, and they both generate a "migration nr. 6" on their feature branches.


Not at my desk but from memory, this is a something with a clearly documented "solution".

I put that word in quotes because I don't want to imply it's a problem - it's just something you need to know how to handle.


That's the problem I have with Python mentality. Everything is great, you just need to know about the myriad of pitfalls and problems. A good environment makes problems obvious and and allows to communicate decisions clearly.

This is the trap of the local maximum.


> ..with Python mentality

As opposed to what mentality ? I mean any tool I worked with has the same problem.


Not that, this is the problem they're referring to:

> > Everything is great, you just

Part of the python mentality is to pretend difficulties aren't there in order to keep presenting itself as beginner-friendly.


Something in the official docs or more like shared through blogs or stackoverflow ?


It's in the standard docs:

https://docs.djangoproject.com/en/5.1/topics/migrations/#ver...

You just rename one of the two files, and add the dependency. If they touch the same fields, you obviously have to resolve that manually.


Thanks a lot


No problem!


May I ask, what experience do you have with other ORMs?


There is one part in me that says to not destroy your positive mood, there is another part in me that wishes to yell at the pythonistas in general to look outside the Python world.

Too often I come across Python projects that are hyped and when I dive into it I find it rather underwhelming to say it politely. Invariably it turns out that those people don´t know any other language than Python.

I see that as a general problem; Python is the language for beginners this day with an endless amount of tutorials, but it seems lots of those starters don´t get to have a look outside the Python world. I fear this is getting an even bigger problem because AI models are trained on a vast range of Python, not because it is better, but simply because it is the PHP of these days.

I don´t mean to imply that I know you have that "narrow profile".

I agree that the value in Python is for quick prototypes in case you know Python well. Outside of that (at the risk of a language war), I think one does better look into modern .net core or the JVM for a general purpose, high quality and highly efficient language/platform.

On the topic of ORM I think EF Core (.net core) and Doctrine (php) is strictly better.


ORMs are a bad idea in the first place: relations are a great way to organise your data, why would you want to sully that by converting to something object oriented?

> Too often I come across Python projects that are hyped and when I dive into it I find it rather underwhelming to say it politely. Invariably it turns out that those people don´t know any other language than Python.

To give a nice counterexample: Python's hypothesis library is really great, and compares favourably to its Haskell inspiration of QuickCheck.


They’re incredibly convenient. It allows smaller groups of people to accomplish more in a shorter amount of time by providing a standardized interface that does a great job of integrating with their existing environment. When translating database input and output into server-side representations, you have two options: build and maintain the process yourself or use a mature tool designed for that specific purpose in your server-side language. I strongly prefer the latter, as do many others. I find that most people critical of ORMs end up creating their own narrowly focused, weakly tested ORM-like database compatibility layers in their backend. These have caused more problems than ORMs ever have.


> When translating database input and output into server-side representations, [...]

I'm saying that your server side representation could also be done as a relation. No need for object orientation. I don't have a problem with the M part of ORM, but with the O part.

_If_ you are having different representations, than having an automated mapper between representations is good. Agreed, yes.


A more appropriate name for ORMs would be "network data model to SQL mappers". They don't facilitate good OOP and oppose relational thinking and data management.


> ORMs are a bad idea in the first place: relations are a great way to organise your data, why would you want to sully that by converting to something object oriented?

One big why is SQL. It's a horrible language/API but sadly practically all relational databases are SQL.


This is the nail on the head.

I’m really not a fan of ORMs, I don’t think they save any time because to use them in a safe way, you NEED to fully understand the SQL that will be run. There’s no shortcut from this and it’s plainly just an extra step.

I’ve been bitten too many times but one ironic problem is that they’re so damned reliable. 99.999% of the time your use of an ORM will be just fine. It can be hard to argue against sometimes even though that 0.001 time can be an extinction level event for the company.

So if you need to understand the SQL that will be run anyway and ORMs occasionally introduce disastrous behaviour, why persist with them?

Well to start with the obvious, SQL is not panacea, you can still be bitten with non-deterministic behaviour when you hand craft every query in sql (e.g. when was the last time you updated stats on that relation - will the optimizer select an appropriate strategy?).

But a stronger reason is that some harder queries just suck to write correctly in SQL. Maybe you’re working with some kind of tree or graph and need a recursive common table expression: Enjoy! It might be better in most cases to write such a query at a higher level of abstraction (ORM) and take advantage of the more productive testing facilities.


I agree that SQL ain't great.

You can model relations in your language without using SQL. (You could have a mapper from your internal modelling of joins etc to how you talk to your database.)

In eg Haskell that works really well, but other languages are also flexible enough.


So horrible that it’s been the standard for 50+ years.

The relational model is great. Embrace it, because it isn’t going anywhere.


Relational model is good, SQL is not. They are not the same thing.


Short of academic languages like D, nothing faithfully implements the relational model. Some compromises are necessary to make it useful for practical applications, hence SQL.


Have you looked at Datalog recently?

SQL has other warts, apart from not being completely relational.


For pythonistas reading this: JVM and .NET frameworks are better for large projects in the sense that they turn even the simplest projects into large messes of pointless byzantine buggy architecture, which in some circles is a sign of "high quality".

Edit: That said, learning multiple programming languages (paradigms) is a very good idea. A reasonable dose of e.g. Java, .NET or C++ for example is great for understanding how programming languages and software should not be written. Learning modern JavaScript makes it obvious how Python really suffers from weak functional programming support, very bad performance and a horrible packaging system.


> For pythonistas reading this: JVM and .NET frameworks are better for large projects in the sense that they turn even the simplest projects into large messes of pointless byzantine buggy architecture, which in some circles is a sign of "high quality".

You got to back up this claim. The technologies you mentioned are entirely orthogonal to the architecture you are using.



This is not even a counter argument. We are talking ORMs here, not dependency injection frameworks.

The Java equivalent to Django ORM would be Hibernate.

It doesn't strike me that you know what Spring does, nor where its own features end.


The comment I replied to was to general development. And Spring is the most popular Java web framework (that it's called a "dependency injection framework" is kinda telling about the architectural ideology).

I've had the misfortune to use both Spring and Hibernate professionally.


I used Java, python, and golang professionally, and the Java ecosystem (and yes, including the language) is far ahead of what the others offer on almost all important fronts, including readability, expressiveness, testability, introspection, etc. You just need to be a little bit disciplined - then again, that is always the case regardless of language/platform.


Wow, I didn't think people still thought like this in 2024.

The widespread use of Python is a problem? People should learn java? Okay.

I'll note you don't talk about the usefulness of the projects themselves, only your opinion of what the code "looks like".


How are those better?


Statically typed languages, performance, refactorability, instrospection, debuggability, observability, the list goes on.


job security for one big one


no, i'm not about to go back to .net core


I adore the Django ORM but as listed under cons... it makes it very hard to avoid accidental N+1 queries, and they don't seem interested in addressing this (https://code.djangoproject.com/ticket/30874). Yes lazy loading is neat when you're messing around on the shell, but you should absolutely not be leaning on it in production at any kind of scale. So instead you have to use unit tests to hopefully catch any N+1 queries.


Just to say there are libraries to help you find n+1 queries too (when your code is running).

I use https://pypi.org/project/django-nplusone/ for instance. Sentry also warns of these by the way.


> they don't seem interested in addressing this (https://code.djangoproject.com/ticket/30874).

Actually it seems they are: https://code.djangoproject.com/ticket/22492#comment:11


It is a huge red flag to me when people recommend Django ORM without admitting the pitfalls of the Active Record pattern. This is a problem which simlpy doesn't need to exist and it's wasting computing and dev resources time and time again.


What would be the best way?


If it were up to me there'd be a way to completely disable it globally for all models. Let me explicitly enable it when I'm just shelling around or checking results in unit tests. It is not a feature for production environments.


Laravel does this (or at least has a config for it). You can disable lazy loading in non-production environments which will throw fatal errors whenever a model is lazy loaded.

The nice part is, in prod, that code _allows_ lazy loading since application stability is more important. Hopefully at that point you have a good performance monitoring tool that will alert you to that problem. Laravel also has hooks to more granularly fire events about lazy loading if you want to roll your own notification solution.

So I disagree with your assertion. Lazy loading is ONLY for production. While it is possible that such a feature could potentially bring down your DB server and therefor your app in general, if the feature is turned off locally, then hopefully you’re catching it well ahead of time. And if you’re running small app with no users, meh. For large apps, hopefully your team’s standards are enough to keep the monsters at bay.

Lazy loading is a sharp knife that requires skill to use appropriately.


Hmm I would call lazy loading the opposite of a sharp knife that requires skill.. you're not being explicit about database work. You're relying on the framework to bail you out where you've forgotten a fetch. And by default Django will do that silently so missing prefetches easily go undiscovered.


Would auto-joins be better in production envs?


There's a Django app (that I forget the name of) that disables lazy loading in templates, thus requiring you to load everything explicitly in the view.


I think there's a few 3rd party solutions like https://github.com/charettes/django-seal but I don't love the idea of using something that I assume is monkey patching Django code.


There is an ongoing effort to include back the equivalent in core [here](https://github.com/django/django/pull/17554)


Static lint wouldn't be a bad way


Not really.

.prefetch_related (for whole models) and annotate/Subquery (for fields or aggregates) have existed for many years, alongside a pile of aggregate functions which have existed forever and have improved.

Whether or not you use the tools given to you is a sign of developer quality and experience. You can easily avoid n+1 99% of the time but you have to appreciate the problem exists.

I think the Django project demanding some competence is okay.


Not sure you understand my complaint because forcing the user to track down N+1 queries to know where they're missing .prefetch_related is the problem. I don't want to fix something 99% of the time. I want my ORM to enforce correctness 100% of the time.


ORMs do a great job at making easy things even easier and hard things a lot harder. If that sounds like a bad deal to you- it’s because it is!


The Django ORM makes migrating database state (a hard problem) super easy. It also makes GROUP BY queries (a relatively easy problem) oddly difficult. Use an ORM where it helps, use SQL where it doesn't.


I personally don’t think DB migrations are that difficult, and I’d say they can be done a lot safer without an ORM. I’ve also never found an ORM that was easier to learn and overall easier to use than SQL. I’ve never understood how ORMs became so popular, I don’t get how somebody can look at how opaque and complicated they are and conclude that learning an ORM is easier than just learning SQL.


100%. ORMs are good for basic queries but the messes I have seen written joining 20 plus tables with conditions and left joins, etc, oof. Just use SQL, it’s much cleaner and easier to maintain.


Because it's not about avoiding SQL (although that's a benefit), it's that you also get forms for free, view classes, easy APIs, validation, a free admin, free docs and a large pool of potential employees who all understand the foundations of your app.


You don’t get any of that for free. You get it at the cost of working with the abstractions, which can be very high. Your RDBMS is not an OOP/multi-paradigm system like Python is (or whatever other language). Its objects interact differently, and it has different access patterns. Any system you use to try and ignore these differences is just going to create a host of other problems for you, especially this idea that it will allow you to simply ignore database administration.


With LLMs I've managed to ignore these issues so far on my django app with 100+ views and 50+ models. So yeah, I get that for free. My app is more complex than a simple CRUD app, but still at least 90% CRUD.

The benefits far outweigh the disadvantages.


To be fair it sounds like LLMs are managing that for you. You’ve got two (mostly) black boxes stacked on top of one another here. You’ve avoided managing your DB and avoided managing your ORM. What will you do when something eventually goes wrong? Ask the LLM to fix it?


Yeah. Or apply my brain. Reviewing LLM-written code is no different to working with code from a colleague or library. It's the best of both worlds, certainly while iterating quickly on the MVP & early stages.


Why Django ORM is considered a beast? It is easiest ORM to date and very convenient API to work with. If you think Django ORM is a beast, try SQL alchemy


I used sqlalchemy once. I found it more complicated with way less functionality


As someone who's had numerous grand battles with both, I find that SQLAlchemy's overall design philosophy gives you more precise control over both the generated SQL, and the lifecycle of each query/transaction/connection; you get to choose exactly how much abstraction you need for the task, but it won't hide things from you where you MUST make a conscious decision. I find it's also a better choice of the two, if you're writing something that has absolutely NOTHING to do with web whatsoever (e.g. a DB-backed desktop app), as you don't have to drag in a whole kitchen sink just to drink a glass of water; but you may also want to investigate something lighter than SQLAlchemy for that use-case.

Meanwhile Django's ORM just does the job perfectly fine in the 99% common case, and for the 1%, again 99% of the time it has enough escape hatches to solve the hard problems without making things excessively complicated. But $DEITY save you if you're in that 0.01%...


For me the killer feature of django is the auto-generated admin UI. I initially started my last project using Spring boot, but I was astonished to find there was no equivalent.

I don't know how people build websites in any speed without such a tool. I guess they just waste time duplicating effort on an admin UI or pay for one of those tools that can generate one from an API (meaning they have to also build an API).

It's such a massive time-saver I switched to django after a week or so.


It can be quite difficult to do anything mildly advanced with the admin UI, IIRC

And it encourages CRUD thinking instead of thinking about business processes and user experience

It's great for tiny/personal projects but in an organisation it can be a trap IMO


I used to think this too. LLms make generating such UIs in other frameworks too much easier. Also the admin UI gets clumsy very fast and you’ll need to roll your own for a good UX.


It depends on what kind of app you’re building. The Django UI could be okay for your business use case, but for many businesses, it is not. If I have to write the entire admin UI for some specific business reasons/workflow, then I might as well dogfood those components for the needs of our dev team.


Most people seem to love it. I, on the other hand, tend to disable it from the start when creating a new Django project. I add shell_plus and I just use the Django REPL to explore the data (in addition to dbshell - psql in my case). That way I can type instead of having to navigate a UI.


I would say even more:

- the main reason to use django today is the admin UI

- and the only reason to use django ORM is the admin UI


Is there an admin ui which has finegrained horizontal and vertical permission depending on user? Django admin assumes admins have God-level control over all registered tables.


I don’t really understand this, I was very much underwhelmed by the admin UI, and feel like I’m missing something? I mean what does it do other than list records and provide some basic forms to CUD those?

It feels like much of the value in that is achieved even better by using a proper database client, say, JetBrains‘ database integration. I certainly wouldn’t let any business people touch a Django admin panel, or at least those I’ve seen built by our data guys. Way too technical.

And don’t even get me started on extensibility. I wanted to have like, an additional page to include a Grafana iframe plus a bunch of buttons. Something that would take me about five lines in Laravel, for example. Good luck even searching the documentation for this…


It excels when you're iterating on an MVP. Once a business is proven and you have the money to go back and put in the the gold-plated solutions, sure, write your own. But while you're building the MVP it's amazing. No need to faff with building a separate admin, even if it is just CRUD.

It means you can focus on your business logic while buying time to get you through the initial development.


Pretty decent introduction. Will there be additional parts that cover how to create GROUP BY queries in the ORM? I find even seasoned Django developers struggle with these.

Also, I believe your code for creating an empty "data" migration is missing the "makemigration" command itself.


> Will there be additional parts that cover how to create GROUP BY queries in the ORM? I find even seasoned Django developers struggle with these.

The last time I needed to do that, I ended up crying "uncle" and writing manual SQL. I wasn't happy about doing it, but I was happy that the framework left me an escape hatch so that I could.


It's actually easy once you get it, it's just that they choose bizarre terminology that doesn't translate directly from sql. .values() and .annotate() ?!


Good topic for a second blog post/skill share. Are there any other interesting topics you can think of? I think I know so much but in it too deep to remember what is unusual. These came up with the recent onboarding of some new devs.

Thanks, I will update the data migration code!


People focus too much on the query aspect of an ORM. Even though you can still write raw query strings in an ORM if you want to.

Routes, form validation, REST API, templating (if you don't need react), auth, etc.

You'll probably wind up recreating a lot of ORM and surrounding functionality at lower quality


Anytime I hear ORM, I always think of this: https://blog.codinghorror.com/object-relational-mapping-is-t...


I tried reading and understanding the arguments made here, but just could not make any correlation with my day to day experience using ORMs.

is it possible that the fact this article is written in 2006 simply makes it dated? it seems very catastrophizing but we've come a long way and are just more aware of how to work with or around the shortcomings and flaws of ORMs.

I've often written programs where I'm trying to encapsulate pure in memory state into business objects and run into the same type of issues people complain about with ORMs. programming is just hard, we don't have to be such dogmatists about it.


I probably should have linked to the original article[1] instead of codinghorror's summary.

I don't think the article being from 2006 (or 2004) makes it dated (though it does have a date). I think it addresses a fundamental issue with ORMs that will always be there, which makes it a bit of a classic.

David Hang's Django ORM article has a bullet-point section on ORM cons, including "difficult debugging", "performance", "hides underlying SQL". Whereas Neward's article goes into depth on each of the following topics:

The Object-Relational Impedence Mismatch

The Object-to-Table Mapping Problem

The Schema-Ownership Conflict

The Dual-Schema Problem

Entity Identity Issues

The Data Retrieval Mechansim Concern

The Partial-Object Problem and the Load-Time Paradox

If you haven't read the original article, I highly recommend it. You'll learn a bit more about "Vietnam" (from a US perspective), and be better equipped to discuss and make decisions about ORMs afterward.

[My own opinion is that ORMs can be useful (I wouldn't say never use one), but that a programmer should be grounded in SQL and the relational model[2] first, so that they'll know when to, and when not to, use one.]

[1] https://web.archive.org/web/20220823105749/http://blogs.tedn...

[2] https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf


Yes. Object orientation is the problem here. Relations are (mostly) great to model business logic with. The article you linked mentions this:

> This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.

But:

> Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework.

> Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects".

I agree that bringing relations into your language is a good thing, but many languages are strong enough to do that as libraries, there's no need for baking support into your language. Just as hash-tables can be implemented as a library in most languages.

We have relations as a library in a Haskell dialect I was working with in a previous job. Worked like a charm, and no language support was needed. Python would also be flexible enough to use relations as a user-defined datatype. And so would many other languages.


orms are exercises in OCD.

Databases are the bottleneck your classic website. We choose to query these databases in a extremely high level language called SQL. This language is so high level that you need to come up with tricks and query analyzers in order to hack the high level query into something performant.

A better abstraction would be one that's a bit more similar to a standard programming language with an std that has query related operations/optimizers that can be composed so programmers can compose and choose query operations and avoid optimization issues that are hidden by high level languages like SQL.

We are unfortunately, sort of stuck with SQL (there are other options, but SQL remains popular because years of development has made it pretty good in spite of the fact that it's a poor initial design choice). This is already a historical mistake that we have to live with. Same with javascript (which has been paved over with typescript), same with CSS, etc. The web is full of this stuff. It's fine. My main problem is the ORM.

The ORM is just another layer of indirection. You already have a high level language you're dealing with, now you want to put Another High level language on top of it? ORMs are basically high level languages that compile into SQL.

What is the point? The ORM isn't actually making things easier because SQL is pretty much as high level of a language you can get outside of having an LLM translating direct english.

The point is OCD. Programmers find it jarring to work with SQL strings inside their beautiful language so they want to chop up a SQL string into web app language primitives that they can compose together. Query builders operate on the same concept but are better because they aren't as high level.

This is basically the main reason why Many programmers experience the strange counter intuitive phenomena about why ORMs actually makes things harder. You have to Hack SQL to optimize it. Now you have to hack another ORM on top of it in order to get it to compile it into the hacked query.


You are correct in your entire assessment, yet, you seem to underestimate the number of boring CRUD applications serviced by mediocre programmers. Limiting the number of technologies required to have at least a little knowledge in is a benefit, even if it hampers performance, because performance doesn’t matter for the vast majority of cases. Software engineers tend to overvalue that bit; the users of line-of-business apps don’t have a say anyway, but what matters is quick adaptability to changes in business logic.

So having an ORM in place that is tightly integrated in Python and Django lets even the junior developer fresh from the bootcamp make changes to an existing application.

It’s not a pretty story, but in my opinion the reason for staggering layers of complexity is the ability to move quickly even without experts on the team.


I had to scroll all the way down to find this. As a CTO who personally never understood the point of ORMs the benefits become very quickly obvious when your organization starts to scale and the prospect of dozens or hundreds of developers of unknown quality hitting your production db with raw sql becomes objectively frightening. Of course there are ways to setup and administer your db to prevent the most obvious footguns, and it is still possible to write bad queries with an ORM, but having that extra layer with limitations gives some extra peace of mind.


> This language is so high level that you need to come up with tricks and query analyzers in order to hack the high level query into something performant.

What? You have to understand a language to write performant code in it. That’s not a hack, that’s basic competence.


> What? You have to understand a language to write performant code in it. That’s not a hack, that’s basic competence.

The poster is not referring to understanding the language, he/she is referring to having to guess at how to structure the query in a way that increases the chances that a good plan is within the constrained search space of plans (due to it being a combinatorial problem and the optimizer has limited time and information).


No it's not. Many language are designed so certain performance aspects are invisible. Take golang and python: The garbage collector.

The garbage collector is something you should not think about when programming and most programmers don't even need to know it exists. You only think about it when you need to seriously optimize things.

For SQL it's a bad design choice because of what SQL is and what SQL is targeting.

SQL is a high level abstraction that automatically determines a query plan based off of high level input. The problem here is that it can choose a bad query plan. So you need to "hack" the query in order to trick the query planner into doing what you want.

It's also targeting the slowest part of the stack: Non volatile memory and IO. The slowest part of the stack should be targeted with a zero cost abstractions to maintain speed while the fastest part of the stack you can use a language like python for your web app it's fine because databases are magnitudes slower.


Point taken re: gc.

For query plans, in nearly every plan flip I’ve encountered, the root cause came down to either poor table / column statistics (because they had let the table grown too large without appropriate tuning), having an enormous amount of joins beyond the deterministic level the planner could provide, or sub-optimal queries. The latter is growing less common as planners get better and better at rewriting on the fly, but it’s always good to know how to do things optimally.

A good example is semijoins / antijoins. With modern versions of both MySQL and Postgres, you’ll probably have “WHERE foo IN (…)” turned into “WHERE EXISTS (SELECT 1…)”, but it’s better to write it the optimal way in the first place.

I will grant you that indexing can be bewildering, with its many rules, caveats, and gotchas.


I’ve used several ORMs and experienced the same things everyone else does—CRUD is great, but good luck with your first actual reporting dashboard.

Then I tried the new generation of typescript SQL query builders that automatically infer types from your database and provide end-to-end type-safe query results, even for highly complex queries. And since then I became convinced the answer isn’t trying to shoehorn classes on top of relational tables, but having a sufficiently integrated and capable query builder that provides properly typed results.


You'd probably really like jooQ, building queries with typed classes and returning strongly-typed results. Of course, it runs on the JVM where languages have actual types, not a facade over an untyped language.


I thought Django’s ORM was awesome (seriously).

And yet using it was such a shit experience I switched permanently to writing SQL.

I see zero advantage to using an ORM over SQL and in fact see many downsides.

Don’t use an ORM just learn SQL. It’s faster, more direct, more powerful, easier to understand, allows you to make full use of the power of the database and your knowledge isn’t suddenly valueless when you go to a project with a different ORM.


Depends on what you are doing. A good ORM lets you create entities from tables, or creates tables from entities.

Good ORMs make sure your data model is always aligned with the DB, and there is no way in hell a runtime error can occur. What's more, it automates the mindless data class/table matching for 99% of your use-cases.

Once you got a more sophisticated query, please use SQL. But for simple stuff, use ORM.


Django’s ORM is the first one that I ever spent a lot of time with. Throughout my career I’ve interacted with other ORMs from time to time. It wasn’t until I’d done that, that I realised, even though it’s not perfect, how fantastic the Django ORM is. I thought they’d all be that good, but no.

I’ve read a lot of criticisms of ORMs, as I’m sure everyone else has. Some of them are certainly valid criticisms that are immovable and just inherent in what an ORM tries to do. Some of them just seem to be caused by not very many ORMs being good, and the writer not having used one of the better ones.


For me Django and ActiveRecord stand out as 2 good examples of what an ORM should be like. Both feel like they make the simple stuff super easy, the complex stuff figure outable, and the super hard stuff trivially possible with raw SQL and a decent mapping from that back to regular code.

Although over the years my code trends more and more towards `.rawSql` or whatever equivalent exists. Even for the simple stuff. It’s just so much easier than first thinking up my query then bending over backwards three times to make it fit into the ORM’s pet syntax.

Plus raw sql is so much easier to copypasta between different tools when you’re debugging.

And before you say “but sql injection!” – that’s what prepared statements/parametrized queries are for.


Same here, but IMO it is related to skill and experience.

Once you get sufficiently familiar with some paradigm the training wheels can come off.

“Raw” SQL is already an abstraction. Over time all the implicit magic will get on your nerves. Trying to shoehorn two completely different worlds into one abstraction is not worth it: you get to learn today’s untransferable funky ORM syntax and idiosyncrasies while losing sight of the actual skill that matters long term which is SQL itself.

I concede however that handling of SQL, the field names, the relations, is annoying. But it’s core to the problem you are probably solving (some form of CRUD). Plumbing is annoying but as a plumber I’d say get used to it instead of wishing to be dancer.

I notice this in other aspects of my work as well. When I switched away trom desktop environment to terminal I had the same feeling. It’s easier, less hassle, less wonky abstractions, more direct. Completely counter to what popular culture is telling me.


> I concede however that handling of SQL, the field names, the relations, is annoying. But it’s core to the problem you are probably solving (some form of CRUD). Plumbing is annoying but as a plumber I’d say get used to it instead of wishing to be dancer.

It feels more like outsourcing said plumbing to someone that has done a lot of it in the past and will in most cases save you time, even if they won’t do everything the way you’d prefer yourself.

Throw in a bit of codegen and reading your current schema (if using schema first approach) and you’re in a pretty nice spot to be, except for the times when ORMs will get confused with non trivial joins and relationships, but typically then you have an escape hatch to have the query be in raw SQL while still mapping the outputs to whatever objects you need.

To be clear, I still think that mapping read only entities/DTOs against specialized database views for non-trivial data selection makes sense a lot of time regardless of the stack (or even some in-database processing with stored procedures), but basic ORM mappings are useful a lot of time too.


> but typically then you have an escape hatch to have the query be in raw SQL while still mapping the outputs to whatever objects you need.

This is precisely why we introduced the "TypedSQL" feature in the Prisma ORM. For those who are interested in reading more on that: https://prisma.io/typedsql


I've started working on a .NET project that uses EntityFramework and due to some of the magic, I prefer just using raw SQL as well. I've used other ORM solutions in the past as well and I am not a fan ...

But the team chose EF due to it supposedly being easier to integrate with whatever database the customer might be using and that seems like valid reasoning.


It's pretty straightforward, and LINQ method names map quite closely to SQL.

If you are not a fan of it however, you can use queries directly with '.FromSql(...)' retaining the convenience of object mapping. Interpolated queries use string interpolation API to convert them to parametrized queries that are injection-safe under the hood.

https://learn.microsoft.com/en-us/ef/core/querying/sql-queri...


Yes I agree. That’s a valid argument, but there are other ways as well like SQLKata and other query builders.

Depending on the complexity of the product I think writing standard SQL92 will get you far as well.

I don’t get how an ORM will handle Postgres’ CTEs and windowing functions. Those are pretty basic and extremely useful features to me, but those require dropping to raw SQL. Each vendor has those useful particularities that you immediately lose by going ORM.

So in practice you are already required to make your queries as bland as possible where GROUP BY and JOIN is about as complex as it will get. I’d say just do SQL92 work around the limitations - which is something you’ll have to do anyway, but now more directly - and all major vendors will work OOB.


> I don’t get how an ORM will handle Postgres’ CTEs and windowing functions. Those are pretty basic and extremely useful features to me, but those require dropping to raw SQL. Each vendor has those useful particularities that you immediately lose by going ORM.

Django has supported windowing functions since 2.0, almost 7 years ago: https://docs.djangoproject.com/en/2.0/ref/models/expressions...

And people are experimenting with CTEs - this looks pretty natural to use, at least with the basic example, and it supports recursive CTEs even if the example looks clunky (though I think that's their choice of example rather than the actual recursive part making it look bad): https://dimagi.github.io/django-cte/ (repo: https://github.com/dimagi/django-cte)


The awkward part is realizing Django has been this good for well over a decade. The core design hasn't (ever?) changed.

I started using Django on I think version 1.2 or 1.3 in 2011, back when it didn't have database migrations and you had to use a library like South for it. Even then, as an ORM/query language it was apparently better than what other languages have now.


The design of the ORM changed substantially once, early in Django's history, when the "magic removal branch" landed. There are some notes on the new syntax that introduced here: https://code.djangoproject.com/wiki/RemovingTheMagic#Codecha...

That branch merged on May 1st 2006: https://www.djangoproject.com/weblog/2006/may/01/magicremova...

I've long found Django's commitment to not breaking too much at once inspiring. The release notes and upgrade guides are really solid.


I used Django a lot from about 2007 to 2010 and, then, went for several years without using it at all. When I came back to it, I was delighted to find that everything still worked like it was supposed to, just better. Congrats on getting it right on the first try. That's...not something that happens very often in software.


Django's ORM is acceptable - but I think Rails/ActiveRecord is superior albeit certainly more opinionated. Most likely just my personal bias speaking because Rails was the first web "framework" I cut my teeth on.


I think Django's ORM is really great but only as long as that's the first thing you have seen.

If you started from Rails and ActiveRecord, you're probably not be very appreciative of Django's ORM.


SQLAlchemy is the best stand alone ORM I have found.


I don't know, the declarative API is awkward. I get SQLAlchemy is quality software but I found DX poor.


sqlalchemy is easily one of the best ORMs ever made for any language. Hats off to zzzeek.

Currently working in typescript and writing SQL queries out by hand since I just don’t trust anything to do the right thing wrt units of work.


i've used both. they both have pros/cons. use whatever tool your team is most proficient with/stop wasting your time arguing about which framework is best


I love the django ORM and then the REST framework on top of that.

Define the object, hook it to a view, use our custom permission class, done.

GET, POST, PATCH, DELETE all urls, filtering, pagination, search, ordering, complex multi-org multi-user object permissions handled automatically. Need custom logic? Just overwrite method.

It’s a productivity superpower, sure there’s a lot to learn but that’s the price you pay for the tools that make you hyper productive


Ninja is even better than the REST framework. Practically no boilerplate.


In 15+ years I’ve only used two orms: Django ORM and SQLAlchemy. I’ve also skimmed docs for many others (some JS orms, etc), and those often look completely unusable.

SQLAlchemy is leagues above and beyond Django ORM. I can’t say I have nightmares from dealing with it, but it certainly was not pleasure. A bit too much magic here and there, not flexible enough and provokes too much bad practices. Re-defining save() methods to do a lot of work, anyone?

The best “orm” that I’ve ever used was not an ORM but a query builder — HoneySQL from Clojure. That one is fantastic. No ORM bullshit, no extra data loaded from DB, no accidental N+1 queries. Not reading docs on how to do a GROUP BY, everything’s just easy to write.

Frankly, we often use SQLAlchemy as just a query builder too. Makes life so much easier.


This is generally my experience as well. I don't love using ORMs but at least Django's is relatively painless and I can generally find the escape hatch I need.

The lack of support until very recently (and it's still lacking) for views is the main knock.


I agree. Django's ORM is great because it handles the relationships well, where many ORMs barely do the object mapping part well.

I'm rewriting a large Django project in Java (quarkus + jooq), because it's at the point where I need a type system now, but it still has a place in my heart.


Aren't Django models close enough to types?


Nope. Python type hinting is far far from Java like types.

And yes, I guess OP has now a large system that needs types enforced by the system to reduce the friction in evolving the stack.


If this is what you're referring to:

> Note: The Python runtime does not enforce function and variable type annotations. They can be used by third party tools such as type checkers, IDEs, linters, etc.

Then yes, Django model definitions are more like Java types in that they error if you try to use an incorrect type. You can't just ignore them like with type hints.

They also try to be 1:1 with database types, so for the most part any additional validation added on top of Django would be something you had to do anyway.


What about using pydantic?


FWIW, one python project I'm working on uses an obscure Framework, and ORM. I was contemplating to convert it to FastAPI+Pydantic, however the amount of effort needed was no different than rewriting the whole project.


I already use type annotations with Python for use within my IDE. It's just all tiring garbage. Java is already almost a scripting language and I can actually use a shared heap, etc. also I have some core code in java I use in the desktop app I want to reuse in the server. Right now I transpile that core code to TS and then JS to use client side, but I'd rather just have everything in Java.


pydantic + sql database?

So have 2 components that ensure the types are correct? Why?


> I thought they’d all be that good, but no.

I had the "pleasure" to use Doctrine once. Never again !


I used Doctrine for a few years, and I remember thinking that it was about 50% awesome and 50% terrible. I wonder what I would think if I came back to it today.


This is largely academic now. LLMs do a good job of writing highly complex queries with the django ORM.

All you need is the django toolbar so you can check their efficiency, then keep telling it to make them more efficient.


LLMs are not at a point where we should be treating them as a solution to any software engineering issue, period.


I've used it to write horrendously complex queries across multiple tables. It can do things I don't know how to and that would have taken significant time to learn. All I know is it works and it's performant when you tell it to be.

It's been particularly helpful for aggregations, subqueries etc. You may not think LLMs are there yet but my project is proof. Try it yourself.


> It can do things I don't know how to and that would have taken significant time to learn. All I know is it works

Until it fucks up and you have no clue how to fix it, and either does the LLM.


Or just read the code like I would if a colleague had written it, or do you think it's chucking out assembler?


What good will reading the code do you when by your own admission you can't understand it?


I said I couldn't write it, not that I suddenly became unable to reason or understand programming.


You said you had no interest in learning it.

If you didn't learn it, you don't know it.

It doesn't matter if you can reason or 'understand programming' if you haven't learned what is necessary to understand this particular piece of programming.

Or is it that you think you can 'reason or understand' any piece of programming? And you possibly assume that is easier, to read what an LLM generates rather learning what is necessary to write it yourself?


Spending time to learn? Nah better never learn and put together barely working stuff that nobody dares to change!

This isn't something new.


Yet people do, will continue doing so, and it works. Trying to stuff the toothpaste back into the tube isn’t going to work, insisting on people just using plain text mail, or stop calling the pound symbol hashtag. It’s just old men yelling at clouds.


And the title of Software Engineer becomes even more meaningless.

Imagine an electrical engineer saying, “I don’t really know what a bridge rectifier is, but I plugged these things into the breadboard where ChatGPT told me, and the output signal looks correct.”


The engineering is in saying "I need to get this data within these performance constraints", not in the now worthless knowledge of exactly how to fetch it.


That’s not engineering, it’s operating. There’s nothing wrong with operators as a career field, but don’t equate it to engineering, and don’t expect to be as highly paid (why should you be? You can’t fix it when it breaks, and you don’t know how it works).


Just because you don't write it doesn't mean you suddently become incapable of understanding it.


Sure, that's the kind of "engineer" that then needs to offload the work entirely onto someone else when things don't magically work as he wanted.

The other person is the engineer here.


You're welcome to use LLM's, but you need to be able to answer for every single line, every single construct, parameter and class before pushing it to production. That is your responsiblity as a developer.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: