
Well.

I've written at length about this in the past, and basically my opinion comes down to there being two ways to approach this. One is driven by the application, the other is driven by the database.

The application-driven approach is, I believe, the vastly more common use case: you have some things in your code (objects, data structures, whatever) and want some way to persist them and fetch them back later according to their properties, so you use a SQL-based DB because, hey, you can get them for zero monetary cost (PostgreSQL, MySQL, SQLite, etc.), every language has ways to speak SQL and every developer on your team knows SQL.
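To make the application-driven case concrete, here is a minimal sketch using Python's standard sqlite3 module. The User class, the users table, and the helper names are all made up for illustration; the point is only that the table is shaped after the object, not the other way round.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class User:
    # Hypothetical application object; the table below simply mirrors it.
    name: str
    email: str
    age: int


def open_store(path=":memory:"):
    """Open the database and create a table shaped like the User class."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (name TEXT, email TEXT, age INTEGER)"
    )
    return conn


def save(conn, user):
    """Persist one object as one row."""
    conn.execute(
        "INSERT INTO users (name, email, age) VALUES (?, ?, ?)", astuple(user)
    )
    conn.commit()


def find_by_min_age(conn, min_age):
    """Fetch objects back according to one of their properties."""
    rows = conn.execute(
        "SELECT name, email, age FROM users WHERE age >= ? ORDER BY name",
        (min_age,),
    )
    return [User(*row) for row in rows]
```

Usage would be something like save(conn, User("ann", "ann@example.com", 30)) followed by find_by_min_age(conn, 25). Nothing here leans on the relational model beyond "rows in, rows out", which is exactly the style being described.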

The database-driven approach is, in my experience, less common: you start with the database, design it meticulously, take full advantage of the relational model and the features it and your DB offer, build as much as possible in the DB layer and then let people write applications which talk to it.
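For contrast, a rough sketch of what "build as much as possible in the DB layer" can look like, again with sqlite3 so it runs anywhere. The customers/orders schema and the customer_totals view are invented for the example; a real database-driven design would go much further (triggers, stored procedures, permissions), and how far depends on the DB you use.

```python
import sqlite3

# Hypothetical schema: the rules live in the database, not in any one application.
SCHEMA = """
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
CREATE VIEW customer_totals AS
    SELECT c.email AS email, SUM(o.total_cents) AS total_cents
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id;
"""


def open_db():
    """Create an in-memory database with the constraints and view above."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Any application that talks to this database gets the same guarantees: a negative total or an order for a nonexistent customer is rejected with sqlite3.IntegrityError no matter which client sent it, and every client sees the same customer_totals view.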

Both you, and the author of the reply in the email this thread links to, seem to inhabit the latter use case. And that's fine; I know people who do that stuff and I respect it. Where the problem comes in is that you have a tendency to assume that your use case and your approach are or should be the only use case and approach. Which is, frankly, wrong. There are plenty of situations where the application-driven approach does just fine and will take you a very, very long way (and when it breaks down, it probably will not break down because you need to switch to the database-driven approach; it'll break down because of other things), and where the database-driven approach really isn't a great fit (for various reasons).




To be fair to the author of the email, he didn't entirely assume his approach was the only good one, since he did suggest: "either use the data, and structure your data (at that layer) to take advantage of it, or don't use a database." In other words, if what you want is to persist objects, then use something other than a relational database. Perhaps one problem is that most programming environments don't have an obvious choice for this, so MySQL or an equivalent jumps out as the ideal candidate. A brief search for various permutations of "scalable object store" or "object oriented database" didn't return anything useful. I use Erlang, which has an integrated DBMS that can store arbitrarily complex types without marshalling. Are there equivalents on other platforms?


I don't disagree with your application-driven vs. data-driven distinction, and I agree that it's impossible to say in general which one is better. My experience is that it is rare for an application to stand on its own for long. Data has a longer lifespan than application code. Either way, both scenarios exist and both are legitimate.

But making this distinction says little about what are good design principles in order to reduce the complexity of a system. My opinion is that data and relationships among data items should be represented uniformly on the application level as well as on a broader data management level.

Creating one API per combination of attributes leads to a combinatorial explosion, an increase in coupling and unnecessary mental load. It makes generic transformation of data and querying very difficult indeed. We need to work with few general purpose data structures that are easy to reason about.
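A small sketch of the explosion, in Python with sqlite3. The people table, its three columns, and find_people are hypothetical. With n filterable attributes there are 2^n - 1 non-empty combinations, so three attributes already mean seven hand-written finders (find_by_name, find_by_city, find_by_name_and_city, and so on). One general function over a plain mapping of column to value covers all of them:

```python
import sqlite3

# Hypothetical table: people(name, city, age). Only these columns may be filtered on.
ALLOWED_COLUMNS = {"name", "city", "age"}


def find_people(conn, **filters):
    """Return rows matching every column=value pair in filters.

    Replaces one finder per attribute combination with a single function.
    """
    unknown = set(filters) - ALLOWED_COLUMNS
    if unknown:
        # Column names are spliced into the SQL text, so reject anything unexpected.
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = "SELECT name, city, age FROM people"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += " ORDER BY name"
    # Values are always passed as bound parameters, never interpolated.
    return conn.execute(sql, tuple(filters.values())).fetchall()
```

Calls then look like find_people(conn, city="Oslo") or find_people(conn, city="Oslo", age=30). This only handles equality filters; it is meant to show the shape of the argument, not a complete query layer.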

And no, I'm not assuming that my approach should or even could be the only one. But I think the standard reply of "one size doesn't fit all" or "use the right tool for the job" has become all too fashionable. There are many tools for the same job, so I have to have an opinion and I have to choose. I'm not an authoritarian person at all, so I don't care in the least if you make a different choice ;-)


Except you're falling right into that worldview.

You seem to be arguing for databases which exist and are structured independently of any applications which happen to access them. Personally, I think this makes about as much sense as arguing for, say, electrons to have well-defined properties independently of anyone trying to measure them, which is the same as saying that there is no such thing.

You also seem to have trouble accepting that there may be lots of situations where there is exactly one application, and if that application goes away, then so does the company (or the department, or the project). In those cases, I don't see much value in trying to make the database be independent of the application; the database exists to serve that application, and if they happen to be tightly coupled to each other, so be it: sometimes that's how you get something to work.



