Why wouldn't you be able to take advantage of those systems? I think it's quite ...

int_19h · on Dec 20, 2018

Because of how different they are, taking advantage of them requires using them in specific ways that are also very different. If you have an abstraction layer that hides those differences, it's practically a given that it does so with the lowest common denominator approach, where you get all the flaws of both and none of their unique benefits.

(Alternatively, people find ways to use the abstraction layer such that it produces the desired usage pattern. Of course, then the code is no longer truly portable to another DB, because that same pattern is likely to be a perf issue there.)

vincnetas · on Dec 20, 2018

I'm pretty sure you can have one interface for YeSQL an NoSQL.

* getUsers() * saveUser() * listUsers() * deleteUser()

and implementation can implement it in specific way to take advantage of chosen technology. If you have some esoteric use cases they could be handled in special way separate from business code.

kstrauser · on Dec 20, 2018

For that specific use case, maybe. Where it falls apart is in `searchUser`, where the methods (and performance characteristics) of digging through the respective databases are going to be radically different. In a newspaper's implementation, you're going to have to search by date, subject, keyword, body text string, reporter, etc. etc. In MongoDB, that generally involves creating an index on the combination of fields you'll be searching together. In SQL, that generally looks like adding a `where reporter = "Jane Smith"` predicate. The MongoDB version may be faster if you have an enormous amount of data spread across a cluster. The PostgreSQL version will be more flexible when your boss wants to know how many reporters wrote stories about candied yams within three days of the publication of any story containing the word "Thanksgiving".

Being tasked to come up with an abstraction layer that supports the speed of precomputed, clustered indexes with the flexibility of SQL - if I were in a content creation business and not a database engine writing business - sounds like the kind of project that would make me quit my job and go on a one year silent meditation retreat.

zorga · on Dec 21, 2018

That objection doesn't make sense. Queries are nothing more than a tree of predicates, how the backend end uses those predicates is not relevant to the API of specifying the predicates. Things like indexes whether in Mongo or in SQL are implementation details that can easily be hidden and not infect the API. You can interpret the tree of predicates into a SQL where clause or into a Mongo index search.

The OP is correct, your app can speak to an internal API without the underlying database infecting your domain code. That in no way implies you can't take advantage of the best of each database.

int_19h · on Dec 21, 2018

This is all rosy in theory. In practice, the way you write the query matters quite a bit. Often even between different SQL implementations.

And it's not just queries. Transactions often have important semantic differences that will be visible on application layer - again, even between different SQL implementations (e.g. MVCC vs locks).

zorga · on Dec 21, 2018

> In practice, the way you write the query matters quite a bit.

Which is hidden in the query interpreter for said db implementation. Each implementation can break down that abstract query into whatever implementation specific query works best in that database.

There's always some abstract way to represent it that doesn't require vendor specific knowledge nor does it remove the ability to apply vendor specific abilities.

Look, I just don't agree with you, I agree with OP. Db specific stuff should be hidden from the domain layer by an abstract query representation and an abstract transaction representation to be plugged in at a later time.

int_19h · on Dec 22, 2018

Have you ever implemented an ORM? I did.

Stuff like "each implementation can break down that abstract query into whatever implementation specific query works best in that database" is wishful thinking. It's like saying that Java is faster than C++, in theory, because JIT can produce better code. And in theory, it can. In practice, we're not there yet. Same thing with high-level database abstractions - they're all either leaky in subtle ways, or they constrain you to extremely basic operations that can be automatically implemented efficiently on everything (but e.g. forget joins).

zorga · on Dec 22, 2018

> Have you ever implemented an ORM? I did.

Several actually, which is why I know what I'm talking about; I've explored this area extensively. When Fowler first released PEAA I dug and went nuts and spent years coding up and exploring all the possible approaches and figuring out which ones I liked and why and which ones I didn't and why.

> Same thing with high-level database abstractions - they're all either leaky in subtle ways, or they constrain you to extremely basic operations that can be automatically implemented efficiently on everything (but e.g. forget joins).

If you're doing joins in your ORM, frankly, you're doing it wrong. Most ORM's do it wrong, they try and replace what a db does best; the right way to do it is to keep joins in the db. The role of an ORM when used properly is to map tables and views into objects and allow querying over those tables and views with an abstract query syntax. Joins belong in a view, not in code. It's called the object relational impedance mismatch for a reason, you have to draw a line in a reasonable place to get anything reasonable to work well and putting joins into the ORM is crossing that line and is why most ORM's utterly suck. Joins aren't queries, they're projections; put the queries in the code and the projections into the database, this works perfectly and lets each side do what it does best. Queries are easily abstracted, projections are not, projections don't belong in the ORM.

int_19h · on Dec 23, 2018

Any language with named tuples has a type system that is sufficiently expressive to handle joins without any sort of impedance mismatch. So, the only reason to avoid them is exactly the one that I cited earlier - the underlying implementation difference between databases.

zorga · on Dec 23, 2018

> Any language with named tuples has a type system that is sufficiently expressive to handle joins without any sort of impedance mismatch

Incorrect. Named tuples will give you nothing back but a result set; the impedance mismatch refers to the mismatch between result sets and a domain model; getting tuples back doesn't remotely address this problem. I'd suggest you don't understand what the objection relational impedance mismatch problem actually is.

It's not a type system problem, it's fundamental mismatch between the relational paradigm and the object oriented paradigm. If a domain model has customers and addresses, and you do a relational query that joins the customer table and address table to return only the customer name and address city, the resulting set (name, city) doesn't map to the domain objects and isn't enough data for the domain model to load either of those objects which may contain various business rules. This is what the impedance mismatch refers to, relational projections of new result sets simply do not map to the OO way of doing things. Joins that create new projections are a relational concept that have no place in the object oriented world view: objects don't do joins, and object queries don't return differently shaped objects.

Hacks like partial loading of domain objects are attempts to mitigate the impedance mistmatch, but they do not solve it; they cannot solve what is a fundamental difference between two different ways of seeing data. Data is primary in the relational model and its shape can change on a per query basis, this is incompatible with the object oriented view of the world in which whole objects are primary and data is encapsulated and thus hidden.

The object relational impedance mismatch does not refer to a language problem, it refers to a difference in paradigm between OO and relational. It exists in all language regardless of the languages abilities and it's not a problem that can be solved, only mitigated, if you want to use both paradigms. You can solve the problem by avoiding using two paradigms, by either bringing the relational model into the application and not using OO or by using an object database.

kstrauser · on Dec 25, 2018

I don't think there's a point in arguing this further. The other poster seems unwilling to understand that there are differences it databases within a category, let alone that there are multiple categories of databases with entirely different characteristics.

dsirola · on Dec 21, 2018

It's not a common denominator. If I need a Customer aggregate or a Payment aggregate it doesn't matter if I get it from SQL or Document database as long as I get that aggregate. My code doesn't care about query implementation.

kstrauser · on Dec 20, 2018

FedEx needs both big, slow, high-volume trucks for moving stuff between cities, and small, nimble tricks for delivering to the doorstep. Someone decides that it’s inefficient to maintain two separate standards: any driver should be able to get into any available vehicle and have it Just Work for whatever job they have at hand, right? So they decide to make a single vehicle that can fulfill all roles.

Well, a vehicle that can squeeze down an alley won’t have the cargo room of a giant highway truck. The inter-city drivers will hate its poor capacity. Likewise, one that has a big 20-speed transmission for for hauling heavy loads is going to drive the city drivers nuts. They’re going to end up with one single interface to all possible roadways that everyone can come together and agree to hate.

If the database API is so free-form that you can store anything in it, you won’t get the advantages of PostgreSQL’s strict typing and lightning fast joins. If you make it so regimented that your data model ends up looking like a set of tables with foreign keys, then it won’t be able to make full use of MongoDB’s... whatever it does well.

They’re different animals. Choosing one highly affects the rest of your system design, from how you arrange your data to how you add new data to how you search for it. PostgreSQL and MongoDB have fundamentally different strengths and weaknesses, and if you make something that works equally well with both, it’s inherently going to suck equally on either.