Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I have a lot of experience with the Django ORM, but no other.

So I disagree with most points.

The Django ORM does map pretty neatly to SQL (in my opinion), and especially Postgres. The migration tool is anything but "half-baked". And I have observed that beginners have an easier time creating Django models than dealing with straight SQL.

On top of that, the Django ORM mitigates most SQL injection attacks out of the box, without even requiring the developer to know about SQL injections in the first place.

And the ORM enables elegant code reuse by constructing query objects that can be used in Template logic, Form validation, generic CRUD, REST endpoints and more.

I agree that most ORMs don't help you with multiple programming languages. Though both the Django ORM and SQLAlchemy allow to customize a lot of the naming conventions and can be adapted to any legacy schema.

But then again: Is it even a good idea to access the same database from two codebases at all?

Some of the concerns about "ORMs kill the SQL-star" can be boiled down to a lack of a general understanding of relational databases in developers who learn to use an ORM before diving deeper into the philosophy of relational databases.



...you're the ONLY person I saw with a good opinion of Django's ORM. I'm willing to accept you could be right in preferring it, but:

Could you share some resources of using it properly for advanced use cases? Good cookbooks etc.

For, it's always been a huge footgun, coupled with Python's dynamic nature, it's and endless source of bug after bug... Mistype a property on a model instance - no problem, no error, it will just not get saved to the db and data will be lost. Use progres native json and array fields - good luck running your tests on sqlite, the ORM that was helpful up to here just gives up and tells you "f u buddy, you're on your own". Need to write complex hand-coded-sql queries that return complicated data, but you's still want to be able to serialize them into a graph of objects that can then be modified by regular ORM code - good luck with that, nobody cared about this scenario.

They've somehow managed to make it too simple and too complex at the same time! Oh, and don't expect to just click a few jump to definitions an make sense of the ORM's inner code easily, god forbid :|...

While I like the idea of ORMs, I found them all totally "wrong headed" And SQLAlchemy - that's just not worth the effort of learning to use it properly despite its good ideas, if you're only going to end up using 10% of its functionality and in an idiosyncratic way that was not intended.


This thread is littered with people singing praise of Django's ORM. He is not the only one. I'm a huge proponent of it as well.

> Could you share some resources of using it properly for advanced use cases? Good cookbooks etc.

The official documentation covers advanced use cases. Stackoverflow will help you shape up more complicated queries.

> Mistype a property on a model instance - no problem, no error, it will just not get saved to the db and data will be lost.

That's a drawback of most dynamic languages, not a flaw of Django's ORM.

> Need to write complex hand-coded-sql queries that return complicated data, but you's still want to be able to serialize them into a graph of objects that can then be modified by regular ORM code - good luck with that, nobody cared about this scenario.

Maybe that's the threshold Django ORM devs thought sensible to stop supporting? At some point, mixing hand-coded-SQL and ORM code introduce a whole set of hard questions and decisions. Very few people expect ORM code to properly play with hand-coded-SQL. If you need to switch back to SQL, you are generally on your own.

> They've somehow managed to make it too simple and too complex at the same time! Oh, and don't expect to just click a few jump to definitions an make sense of the ORM's inner code easily, god forbid :|...

It's simple to use, yet have complex internals. That's about what I would expect from a tool abstracting something complicated.


>> Mistype a property on a model instance - no problem, no error, it will just not get saved to the db and data will be lost.

> That's a drawback of most dynamic languages, not a flaw of Django's ORM.

In Python, mistyping a name usually results in an exception at runtime. If Django doesn't check models for correctness, it can't be explained by being written in a dynamic language.


In Python, mistyping a property of an object doesn't result in an exception. JS have the same behavior.

If you mistype a property on a nullabe or already filled property of a model instance, Django will happily save the model with the mistyped property (which will be ignored because it's not part of the model) without actually modifying anything in database.

Calling that behavior a flaw of Django's ORM is disingenuous.


> In Python, mistyping a property of an object doesn't result in an exception.

Sure it does. It doesn't on assignment, as you can modify the object however you like, but it does have AttributeError for accessing a property that doesn't exist.


Actually it does on asignments to `__slots__`-based classes.


Which you can only use compile time - not dynamically. That said I recall the sqlalchemy orm reject fields that it did not recognize. A decent IDE (pycharm) will hilight these for you on the Django orm. Having now used both the Django orm and SQLAlchemy extensively I prefer SQLAlchemy a lot. The Django ORM is usable, but many of its behaviours are odd and I agree that writing more complex queries gets ugly fast, unlike with SQLA where you can write some really hairy sql in a composable manner that makes it just so much more legible.


There is no such distinction between compile time and run time in Python; everything can be done dynamically. Want to use a class with slots computed at runtime? No problem:

  >>> def with_slots(**kwargs):
  ...   class SlottedClass(object):
  ...     __slots__ = kwargs.keys()
  ...     def __init__(self, **kwargs):
  ...       for k, v in kwargs.items():
  ...         setattr(self, k, v)
  ...   return SlottedClass(**kwargs)
  ... 
  >>> with_slots(foo=1, bar='qux').zork = 4
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  AttributeError: 'SlottedClass' object has no attribute 'zork'
Of course there are many improvements that could be made to this approach (e.g. not creating a new class for each instance, or setting the __name__ to something meaningful) but I hope it shows that dynamic typing and robust error checking aren't contradictory.


They were talking about assignment.


If you misspell the property when setting the value, there is no error.

Again, this can be caught by eye, by testing and probably soon by static type checkers.


i mean if you try to access undefined attribute in model it will throw an exception


> ...you're the ONLY person I saw with a good opinion of Django's ORM. I'm willing to accept you could be right in preferring it,

No, he's not. Almost the entire Django community has a good opinion of it.

The main criticisms I see of Django's ORM seems to be from people that prefer SQLAlchemy. They have different design philosophys and the distinction seems largely a matter of taste and differing use-cases.


> main criticisms I see of Django's ORM seems to be from people that prefer SQLAlchemy

...then I'm in the minority of hating them both, and in the process of looking for an alternative since I'd be using a lot of python too for the foreseable future :|

I liked web.py's database abstraction layer a long time ago - know of anything similar in philosophy but more powerful?


Massimo still maintans it at https://github.com/web2py/pydal and he's working on py4web at https://github.com/web2py/py4web.

DAL was the first Python db abstraction I used, and I liked it well enough, but I've since used both Django and SQLAlchemy, and they both seem less opaque to me, maybe because not obscured by some of the weird things about web2py itself.

I love the auto-generated backwards references you get in Django and SA when you use foreign key and many to many fields. They make life so much easier. I don't recall how that was achieved in DAL, I just remember it being... complicated.


Years ago I had the same discussion [0] on Medium and here is my take on that: You will end up reinventing "an ORM" because your SQL won't handle all the edges cases. At least with eg. Django ORM we know it helps avoid some security flaws.

[0]: https://medium.com/@iMitwe/been-working-with-django-for-over...


thanks for the reference, especially to the Doordash's original article "Tips for Building High-Quality Django Apps at Scale" that was related to... I really like their thinking and think they are solving for the same problems, really like their idea of abstracting/wrapping over the ORM instead of using signals, and I'm looking for a similar architecture for a "default" Django boilerplate, featuring:

- data models in separate files one for each model, under a data_models/

- business logic models that may or may not map one to one with data models under a models/ that would content roughly framework-independent code

- encourage "fat apps"

- figure out some way to let the db be the source of truth and allow "multiple services, one DB" scenarios - maybe having "core models" for which all db migrations are excluded, table names fixe, and possibly extra columns that the django app could ignore but other services could use, and "app/project models" for which regular django migrations would run

I might write it up after I get the boilerplate app, and I'm really interested in hearing opinions for people with opposing viewpoints especially!


Migrations and integration databases don't mix. One works in a code-first paradigm, the other is database-first.


...of course they don't: I was referring to having a sane standardized way of having each app work with two dbs, one that it owns and includes some models of no interest to other apps, a different one that it shares with other apps and is used by other models (and has migrations turned off for these models, ofc).

The main point is that it can result way less code, think of all the APIs between microservices that would never need to be developed or documented or maintained because they'll never exist, yet the problem they would've solved will be addressed :)

Doesn't work all the time, but if most of what you need is data-sharing, and there's a clear model of "this service produces this data, which has a clear documented fixed format, all others just consume it", you'll prevent thousands of LOCs from being written. And I do see my mission more and more as preventing (useless) code from being written nowadays...


In Django we can have multiple databases using multiple backends and either the same, different, or overlapping ORM models.


Indeed. I recently worked on a legacy project where I don't have the luxury to modify the old database. I ended up using Peewee [0] to access the old database, in conjunction with the normal Django based database and even wrote a simple model that use Peewee [1]. Everything works, even Django Rest Framework associated with the model.

[0]: http://docs.peewee-orm.com/en/latest/ [1]: https://stackoverflow.com/questions/57602206/mysql-gone-away...


Django's ORM is fine. I would say that it's good but not particularly powerful.

Its easy to use and makes inserting / updating data fast and easy. It's simple and fast for simple queries. It starts getting a bit crap when you need complex queries. Annotations / GROUP BY stuff isn't intuitive.

It does make it easy for inexperienced people to generate huge numbers of queries by looping through querysets and getting field from another table.

It lets you run raw SQL through it and prevents SQL injection, so its really a case of knowing when to switch to that.


If you're building SQL with string concatenation of data, you've lost. You're not secure. Give up.

Parameterised queries are straightforward in all competent languages. If Python can't do it natively, switch to a better language that can.


Where did I say I was concatenating data? You might want to take a look at Django's ORM before making comments like that, it handles parametrised queries fine.


The way I see it, an ORM is just a way of formalising the false assumption that a database is stupid and useless.

I've taken a look at a couple of Django's most popular ORM options and I suppose if I was a mid-level corporate cog writing CRUD interfaces for Stack Overflow tutorial-grade operations like customer-sale-item-product relationships it would probably work fine. I've only had passing exposure to that kind of soul-crushing drudge work.

For the way I work an ORM is baby stuff, and I'm not even doing anything particularly advanced or sophisticated. I maintain (among other things) a relatively unremarkable bespoke online discussion forum. It comprises hundreds of queries, few of which could be composed correctly by an ORM, let alone composed and run performantly. The median complexity query in my code base probably has two or three joins and two or three subqueries—at least one of which has some kind of tangential aggregation or window function.


> For the way I work an ORM is baby stuff

https://www.reddit.com/r/iamverysmart/

> a relatively unremarkable bespoke online discussion forum.

>It comprises has hundreds of queries

Sounds like you have a design issue. Making the unremarkable complex.


Thanks for the well deserved lesson in internet arguments. You win, sir.

(For the record, complexity and remarkability aren't strong correlates. And it's not particularly complex; probably 90% of the app's queries are written for indirect tasks like moderation tools, automations, statistical reports and various stratum of anti-abuse mechanisms. Most common pages only have one security-related query and one content-related query.)


> progres native json and array fields - good luck running your tests on sqlite

you're not testing what you actually run in production, then what are you testing? If you just need to test in-mem behaviour, mock the DB completely; instead of expecting compatibility for vendor specific features across vendors.


For database testing, I often use a "ramdisk", or in-memory filesystem: sudo mount -t ramfs -o size=512m ramfs /path/to/mount/point, initdb, pg_ctl start. (There's a bit more bookkeeping to generate passwords, store them to /path/to/mount/point/.pgpass, etc.) You can tear it down afterwards with pg_ctl stop, sudo umount /path/to/mount/point. I'll put the necessary commands in start / stop scripts, then call those from within the beforeAll / afterAll hooks of whatever testing framework I'm using.

It's definitely slower than mocking, and therefore not really suitable for unit tests - but it's great for repeatable, isolated integration tests (continuous or otherwise).


Sounds like a job for docker or CI systems.


Hmm... can one test use in-mem db mocking with django.contrib.postgres fields?

I'll definitely look into this, not a solution for existing projects bc of legacy baggage, but probably the right way to do it. Problem in practice is that you yes what you really want is to mock the db completely, but practically speaking existing code is to entangled with the db adapter specifics, so you end up with running with a "fake real db" :|


Can you write the tests so that you create object instances and not save them to a database at all? You don't need to test the ORM and database, after all, just your code.


The closer you keep things to production, the more likely you are to catch errors that will occur on production.


unit tests, yeah, but if you try to get "the most bang for the buck" in like "test a lot with zero/little effort" you'll end up with a layer of "integration tests": that (1) set stuff in db, (2) do stuff, (3) read stuff from db and assert its correctness... A sane ORM would not actively hinder this scenario even cross-dbs or with memory-mocked-dbs, but Django's does!


I don't see the issue - you're asking it to support a feature of one database on another, something it never claims to do. You need to use the same database for testing that you use in production.


I'd say the sqlite backend and using it in testing is mostly useful for reusable django apps that are supposed to be used in a wide variety of projects.

Most of the code I write in Django would never need to be SQLite-compatible, even if a lot of it would be. The SQLite compatibility is almost like a relict from a time when setting up RDBMSs was difficult and slow.


> Most of the code I write in Django would never need to be SQLite-compatible,

Then don't run your unit tests against SQLite, run them against Postgres.


If so, don't use things specifically in the Postgres contrib package of Django? The package is built that way for a reason


I don't know if there is already some plugin to make mypy aware of Django models, but that should be possible and would avoid bugs where you misspelled a property and it slipped through your quality control and unit-test.

I suspect "raw sql" to be a bigger headache and source of bugs in any moderately complex project.

Most of the use-cases of queries are actually very simple, and there the ORM allows you to simplify the code a lot.

My hints how to enjoy the ORM more: Write custom managers and QuerySets (for example for commonly used filters), write generic views that take QuerySets, take advantage of Models and QuerySet in Form Validation and REST APIs.

If you use the Django ORM like you would concatenate raw SQL, then it ain't gonna be pretty...


> Most of the use-cases of queries are actually very simple

I write web applications / data tools with raw SQL and I find that only about a quarter of the queries are simple. Many have subtle specificity to the joins, subqueries and window functions that return for me almost exactly what I want to output.

The problem I see in most people's ORM-powered code is that they lean so heavily on application code to get all the data assembled in the right place and in the right order, resulting in scaling problems

> concatenate raw SQL

Often dangerous and almost never needed. Parameterised SQL is the right approach.


> If you use the Django ORM like you would concatenate raw SQL, then it ain't gonna be pretty...

Depends on what exactly you mean here. The Django ORM fully supports some things you wouldn't initially expect if you hadn't seen it before, like passing one queryset into the filter args of another. It'll actually compose the queries and run it once on the database, instead of running the first and passing the result into the second.


> Mistype a property on a model instance - no problem, no error, it will just not get saved to the db and data will be lost.

Test your code. Use an IDE. PyCharm highlights these errors immediately.

> Use progres native json and array fields - good luck running your tests on sqlite,

You're using stuff from the Postgres contrib package on a different db. It's pretty clear in the docs about that

> Need to write complex hand-coded-sql queries that return complicated data, but you's still want to be able to serialize them into a graph of objects that can then be modified by regular ORM code

Make a view and manage it through the ORM? If you're updating data from this, then what you're doing seems really strange.


I'll leave the other parts since sibling comments responded to those, but:

> Need to write complex hand-coded-sql queries that return complicated data, but you's still want to be able to serialize them into a graph of objects that can then be modified by regular ORM code - good luck with that, nobody cared about this scenario.

It's been built-in since at least 1.7, and I think I remember it since 1.4: https://docs.djangoproject.com/en/2.2/topics/db/sql/


I am the other person with an excellent opinion of Django's ORM. I vastly prefer it to writing SQL queries manually, and the only thing I miss is making it ever more functional-like.


> prefer it to writing SQL queries manually

Everybody prefers anything that does a barely decent job at preventing you from writing SQL manually :)

If you set the bar that low then sure, you'll have an "excellent opinion" of ANY ORM. Can you give an example of an ORM you didn't prefer to writing SQL manually for 50% of the cases? - most that were that bad are probably not in usage anymore...


I'll chime in as someone who has used Django and its ORM for about 5 years and I have to say it's a pleasure to use.


My experience as well. I use an ORM that I absolutely love for most work (Apache Cayenne) and I use plain SQL where appropriate, like for complex reporting.

The "ORMs are bad" articles usually boil down to "My ORM is bad" or "I tried to use an ORM where it doesn't really fit and now I believe all ORMs are bad".


Even for "complex reporting" you can still use ORMs if you save your SQL query as a VIEW in the database. Then you use ORM for reading the output.


Or just use the ORM directly with a raw query, on a model that represents the results: https://docs.djangoproject.com/en/2.2/topics/db/sql/


That's exactly what I do with Apache Cayenne.


> Some of the concerns about "ORMs kill the SQL-star"

This doesn't even make sense. It kills the SQL duds, stars are even more important to understand the convoluted SQL when the ORM isn't working as expected, and cases where the ORM can't handle at all (there are always some).


> there are always some

That's not my experience. In ~20 projects in this company, with pretty complicated GIS-related web apps, there have been about 4 cases where I need to create a raw SQL query because Django's ORM didn't do what I needed. Almost all the time, it just works.


When you say "pretty complicated GIS-related web apps" does it mean you perform powerful spatial analysis using PostgGIS extension over a few millions spatial objects OR is it more about fetching a handful of spatial objects for display/processing in web-mapping framework?

Because for pretty obvious reasons in the first case you'd want computation on the back-end. Sure you can do a lot of spatial operations with python after passing through an ORM... but is this really more adapted than raw SQL & PostGIS? Genuinely curious.


Most of the processing happens outside PostGIS, we often compute things with geometries from PostGIS to raster data that we don't keep in PostGIS. So our Postgis queries are mostly limited to filtering on a BBOX, or closest distance to a point.

What users want to compute can be defined by users, e.g., "take all houses in the Netherlands as polygons, compare with a DEM of the Netherlands to find the lowest point of the house's contour, compare that point to a water depth raster to see if it would be inundated" is a computation a user could define in our GUI and the actual computation would then run in (Python) background tasks.


Thanks for your reply.

I work in GIS field and I tend to put a lot of business logic into PostgreSQL because I can re-use all functions/view in a variety of tool from Desktop GIS, BI and even (shamefully) Excel for some our oldest App.

Whenever we'll move to a new webmap or I we'd want to serve an API it's always one connector away. All business logic put into the DB is heavily reusable because a lot of tool "speak" SQL.

I guess in the end it really depend on wether you approach is data centric vs functionality centric.


Wow, where do you work?


Coming from Rails I think that Django's ORM is too complicated, like almost everything in Django and in Python. However I'm using Python, Ruby and Elixir in different projects and Ecto is much more complicated. It makes me want to write plain SQL as I did in the early 90s.

Actually I'd be OK writing SQL. It's just inconvenient having to decode the results and handle migrations manually. I did plain SQL migrations before ORMs. That's not something I want to be back to.


Django's ORM is pretty complicated under the hood, but I suspect that this is not easily avoidable given the requirements.

I also don't subscribe to the view that Django is too complicated. With other frameworks you end up implementing part of Django, and most of the time that will be worse in every conceivable way.


Ecto is just a dsl for writing SQL queries, not an ORM. It’s difficult to understand if you think it’s an ORM...


This. It's not harder, it's different. Takes some getting used to, but then you realize "oh it's pretty much just sql that returns structs" and you instantly hate every other ORM you've used prior.


I'm not surprised about your comment on Ecto. Ecto takes experience and practice, but by the time you finish writing your 5th or 6th application, Ecto will be so natural that you wouldn't want to go back. Especially not to writing vanilla SQL.


The problem with complicate things is that the second application never comes. I've got a customer that's been paying me to write a Phoenix application for the last two years. That's good but I'm not going to use Ecto in one of my own projects. So no Elixir if it touches a database. There are simpler alternatives to Ecto, maybe I'll use one of them but Ruby and ActiveRecord are good almost for everything.


> your 5th or 6th application

s/application/day/


I used to really dislike django ORMs since I had a background in SQL and thought that ORMs are really constraining and not as flexible as just writing SQLs. However, recently I have come into the realisation that not everyone who programs knows SQL (woah, what a revelation right?), and the django ORM is a really nice tool to fill that gap for them with something they are more familiar with (ie python objects).


If you are using any language other than SQL, then you better have a way to convert SQL data to the appropriate data types of your language and vice versa. And it better be injection-proof and reusable. That's already at least half of an ORM right there.

I think I know SQL fairly well, but I don't see how I could implement things like django admin, form validation, template views etc as elegantly on top of plain SQL.

So I really don't agree with "ORMs are only useful for SQL-noobs".


> But then again: Is it even a good idea to access the same database from two codebases at all?

A motivation for relational algebra was different applications accessing data in different ways. This integration remains a selling point. Further, data often outlasts applications... even languages.


Still, I think accessing a single database in RDBMs like Postgres or MySQL from multiple codebases or even languages means asking for trouble.

Providing some kind of API as a library, REST, RPC or whatever would almost always be preferable, if only to centralize permissions and validation.


the problem is that these days you need a lot of application code to check for permissions and that kind of stuff.


In principle permissions can go in the DB.


In practice this is rarely ever done (right).

Row-level permissions are possible, but it can get more complicated when certain users are only allowed to change a certain row in a certain kind of way but not in another.


I agree, but it does seem to become more compelling if you're doing your integration in the database.


"Is it even a good idea to access the same database from two codebases at all?"

Yeah, if you're migrating code between languages and platforms and want to run both simultaneously so you can go live with a subset in the new platform and phase out the legacy system in stages.


In that scenario there is no benefit to raw SQL because you might as well write the new Application with another/better ORM?


No rewriting. Program must be migrated as-is with as little manual change as possible. If the customer wants to make changes later that's up to them.


But then again: Is it even a good idea to access the same database from two codebases at all?

I think that depends entirely on who you ask. Some orgs may have dedicated DB guys who are designing all the DB schema and stored procedures that must be vetted by the DB guys before the application guys can merge in changes. Others might decide dev teams should be responsible for their own data and store it as they see fit. I think product data should only be updated by one main API and that API exposed internally for other tools to interact with, unless it's something that is special. And in development, there are many special cases.


Microservices communicating via an RDBMS sounds like a recipe for chaos.


Came here to say this, specifically about Django too. It's pretty bulletproof these days.


> But then again: Is it even a good idea to access the same database from two codebases at all?

It's an integration pattern. It works in some contexts.


If its treated correctly as a set of SQL language API-s with clear ownership and explicitly defined public/private parts. There is a reason why its often considered an anti-pattern, and it is a general lack of discipline in maintaining stable semantics and separation of responsibilities.


why using an SQL DB if you don't know SQL?


Because web frameworks, mostly.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: