Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> In any sufficiently complex application, something like a query builder that allows different parts of the application to work together to lazily compose queries

I don't see this as a given and I don't accept it in my own applications. If you need data, go to the data access layer. If it doesn't provide you what you need, build a new repository / provider / whatever your pattern is.

Whether it's an ORM or a home-built query compositor or whatever, one thing I know from experience is that once your application is "sufficiently complex" that you start to (incorrectly) believe you need this, your application has become too complex to use it reliably.

You absolutely will be mistakenly evaluating these builders in the wrong layers, iterating results without realising you're generating N+1 queries, etc.

You don't need an ORM.



> I don't see this as a given and I don't accept it in my own applications. If you need data, go to the data access layer. If it doesn't provide you what you need, build a new repository / provider / whatever your pattern is.

Whichever layer you put it in, you need a way to compose two query fragments. Otherwise you have to write N*M queries manually instead of N+M query fragments.

Either you use an ORM to help you with this or you don't, with all the usual tradeoffs about using a library or not. But you still have to solve the problem. (Or you do a lot of tedious copy-paste work - with all the usual tradeoffs of that)


usually, every interaction with your database is its own unique code path and query

it's pretty rare for queries to be dynamically composed from arbitrary sub-queries

if this is a problem you need to solve then ORMs certainly make more sense, but even in this case I find query builders to be more effective

the interface between the application and the DB is actually a string! it's not an abstract data type, it doesn't benefit from being modeled by types


You never built admin interfaces, faceted search or dynamic query filters?


sure, sometimes, rarely -- these are exceptions, not rules

in general, it should not be possible for user input to produce arbitrarily complex queries against your database

each input element in an HTML form should map to a well-defined parameter of a SQL query builder, like, you shouldn't be dynamically composing sub-queries based on the value of a text field, the value should add a where or join or whatever other clause to the single well-defined query

sometimes this isn't possible but these should be super rare exceptions


I prefer using something like Rails or Django to build 10 fully working CRUD interfaces with well-defined yet dynamic filters in a day instead of spending two weeks needlessly writing the equivalent code by hand.


why would it take you two weeks to write 10 SQL simple queries?


10 simple CRUDs you mean? With dynamic filters, admin UI, auth, tables, and so on? Because these frameworks allow you to do that in a single day.


i'm running out of ways to say that a CRUD endpoint should not have dynamism in the sense that you mean

/users/:id should map to 1 endpoint that's parameterized on userid

/search?userid=:userid&tag=:tag should map to 1 endpoint that's parameterized on userid and tag(s)

endpoints should be simple to write


You’ve never actually implemented a real world implementation, have you?

You’re going to have parameters that are compound. You’re going to end up filtering on objects 3 relations removed, or deal with nasty syncing of normalization. You’ll have endpoints with generic relations, like file uploads, where the parent isnt a foreign key.

It’s going to be a mess. They will NOT always be simple to write.


Yeah and programs should also be simple, but there would be no value to them that way.


Have you ever heard of APIs?


> it's pretty rare for queries to be dynamically composed from arbitrary sub-queries

I'm talking static, not dynamic. You still need to compose two pieces together into a single query, and you can either use an ORM to help with that or not.

> the interface between the application and the DB is actually a string! it's not an abstract data type, it doesn't benefit from being modeled by types

No it isn't. You can't send an arbitrary string to the database and expect it to work. At the very least you benefit from having an interface that's structured enough to tell you whether your parentheses are balanced and your quotes are matched rather than having to figure that out at runtime.


huh?

when your app queries the db, the query is not composed from several pieces, it is well-defined in the relevant method

    fn search(q string) -> result
        return db.query(`SELECT id, text FROM table WHERE text LIKE $1;`, q)
this is a single query, not multiple

the db accepts a string and parses it to an AST, it does not accept a typed value

this means the interface is the string

unbalanced parens and whatever other invalid syntax is obviously caught by tests


> when your app queries the db, the query is not composed from several pieces, it is well-defined in the relevant method

> this is a single query, not multiple

And when you want to query for multiple related things together, the whole point of having a relational database? For different purposes you need different views on your data, and those views are generally constructed out of a bunch of shared fragments; you can either figure out a way to share them, or copy-paste them everywhere you use them.

> the db accepts a string and parses it to an AST, it does not accept a typed value

> this means the interface is the string

The DB accepts a structured query, not a string. It might be represented as a string on the wire, but if that was what mattered then we'd use byte arrays for all our variables since everything's a byte array at runtime.

> unbalanced parens and whatever other invalid syntax is obviously caught by tests

Tests are a poor substitute for types.


i'm not sure what you're thinking about when you say "multiple related things"

every "view" on your DB should be modeled as a separate function

every possible "thing" that's input to a function which queries the database should be transformed into a part of the query string by that function

> The DB accepts a structured query, not a string. It might be represented as a string on the wire, but if that was what mattered then we'd use byte arrays for all our variables since everything's a byte array at runtime.

...no

the API literally receives a string and passes it directly to the DB's query parser

if the DB accepted structured queries, then the API would rely on something like protobuf to parse raw bytes to native types -- it doesn't

like `echo SELECT * FROM whatever; | psql` does not parse the `SELECT * FROM whatever;` string to a structured type, it sends the string directly to the database


> every "view" on your DB should be modeled as a separate function

OK, and when a significant amount of what those functions do is shared, how do you share it? (E.g. imagine we're building, IDK, some kind of CMS, and in one view we have authors (based on some criteria) and posts by those authors, and in another we have tags and posts under those tags, and in another we have date ranges and posts in that date range. How do you share the "fetching posts" bit of SQL between those three different (parameterized) queries?)

> if the DB accepted structured queries, then the API would rely on something like protobuf to parse raw bytes to native types -- it doesn't

> like `echo SELECT * FROM whatever; | psql` does not parse the `SELECT * FROM whatever;` string to a structured type, it sends the string directly to the database

In both cases the parsing happens on the server, not the client. "echo abcde | psql" and "curl -D abcde http://my-protobuf-service/" are both doing the same kind of thing - passing an unstructured string to a server which will fail to parse it and give some kind of error - and both equally useless.


> How do you share the "fetching posts" bit of SQL between those three different (parameterized) queries?)

your application has a fetch posts method, that method takes input including (optional) author(s), tag(s), etc., it builds a query that includes WHERE clauses for every provided parameter

the code that converts an author to a WHERE clause can be a function, the point is it outputs a string, or something that is input to a builder and results in a string

i'm not sure what a "fetching posts bit of SQL" is, a query selects specific rows, qualified by where clauses that filter the result set, joins that modify it, etc.


> the code that converts an author to a WHERE clause can be a function, the point is it outputs a string, or something that is input to a builder and results in a string

So you do string->string processing, and you explicitly won't treat the queries you're generating as structured values? Enjoy your rampant SQL injection vulnerabilities.


no?

creating a query string that's parameterized on input usually means you model those input parameters as `?` or `$1` or whatever, and provide them explicitly as part of the query

nobody is doing printfs of values


Ok so now your WHERE clauses are no longer strings, you have to have some structured representation of them that knows which parameters go with which clauses, and something that understands the structure of your queries enough to line up the parameters when you stitch together different WHERE clauses to make your full query - exactly the kind of thing you were saying was unnecessary.


Wait, why does every query fragment need to compose with every other? What are the N and Ms here?


Say you create a nice little utility function that makes a call to the db func1 and you write func2 that also makes a db query. If you have a func3 that needs func1 and func2 conceptually your options are:

* Accept that this will just be two trips to the DB.

* Write a new function func12 that writes a query to return all the needed data in one query and use that instead.

* Have your tool be able to automatically compose the queries.

If you do with the second option you have to do that with every combination of functions that end up being used together which is multiplicative in general.


Well sure but that not necessary for every combination of every call. Most likely those types of combinations are rare.


This is a problem, seems like ctes and views are a possible ormless solution here


CTEs and views are both a bit bigger than the parts that you would normally want to share and reuse, and they're also not very well standardised. Plus no-one really agrees on how dynamic tables should be, so you end up with the Excel maintainability problem of no real separation between code and data.


Following this advice too closely and your application logic starts leaking into your data layer.

One simple example is when you need to make atomic changes to two different types of entities. In the data layer, this is usually trivial — just run two queries within a transaction — but you need to expose a function that boils down to `updateBothAandB`. Rinse and repeat enough times and your data layer is a soup of business logic.


Exactly, in my opinion, separating your business logic from your data access layer completely is futile. You will need to either have business logic in your data access layer or fine grained query controls (when to add a filter clause, when to execute a query, what should go into a transaction, etc) in your business logic for performance reasons. This is OK, it's incidental complexity.


There are patterns to resolve this. Depending how you structure your DAL, your data access classes can provide transaction handles. It's a data access layer, you don't need to abstract away the fact that you're dealing with a database.


I mean, yeah, but the patterns all look like the thing you said you don’t accept in your own applications: “something like a query builder that allows different parts of the application to work together to lazily compose queries”.


IME young businesses evolve so quickly too much abstraction is a higher risk that some bleeding among DAL and business rules.


How is this not just a hand rolled ORM?


An ORM is a specific kind of data layer. It is general-purpose, and as the name suggests, it is an Object-Relational Mapper whose primary purpose is to map relations to objects and vice-versa.

So no, bespoke data access layer code is not an ORM.


Why are a data access layer and an ORM mutually exclusive? An ORM is just an abstraction over your database. You might find use in it as a tool to access your database but contain the use of the ORM to within your data access layer.


I never said they are mutually exclusive. I said you don't need the ORM.


Mind sharing an example of a large(ish) app that doesn't make use of an ORM? Last time this topic came up, I went looking for one (admittedly not too hard) and I came up empty handed.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: