Complexity? 1) Parsing SQL is complex, and full of decades of warts in expectati...

derefr · on Oct 4, 2018

> There are few "drop in" SQL parser libraries that are trustworthy and support much more than a bare subset of the language.

Libraries, no, but you don't need a parser library to parse SQL; SQL [even in its DBMS-specific dialects] is a regular language, and regular languages can be entirely formalized using a declarative grammar. You just need a grammar file in a standard format, and a parser-generator like GNU Bison to throw it through.

Conveniently, existing FOSS RDBMSes like Postgres already generate their parsers from grammars, so they have grammar files just laying around for you to reuse (or customize to suit your needs): https://github.com/postgres/postgres/blob/master/src/backend...

> Query Planning/Optimization is the "secret sauce" of most SQL databases, and is extremely hard in general. To use SQL as a general API outside of an implementation tied to a specific SQL vendor you'd presumably need a "general" query planner/optimizer that becomes a mini-version of an SQL database in itself.

Definitely the bigger problem of the two.

Personally, I would suggest reversing the whole problem formulation. You don't need to take your existing system and make it "speak SQL." Instead, it's much simpler (and, in the end, more performant) to take an existing, robust RDBMS and hollow it out into this "mini-version" that queries through to your service.

I've personally done this before: I created a Postgres Foreign Data Wrapper for my data service, then stood up a Postgres instance which has my data-source mounted into it as a set of foreign tables. I then created indices, views, stored procedures, etc. within Postgres, on top of these foreign tables; and then gave people the connection information for a limited user on the Postgres server, calling that "my service's SQL API."

In this setup, all the query planning is on the Postgres side; my data source only needs to know enough to answer simple index range queries with row tuples. I don't really have to do much thinking about SQL now that it's set up; I just needed to understand enough, once, to write the foreign data wrapper.

WorldMaker · on Oct 5, 2018

> a regular language

SQL varies somewhere between a context-free language and a context-sensitive language between dialects and/or edge cases.

Pedantry aside, certainly there are open source grammars for particular dialects. But it is the whole package of "drop in" library including at least some basic semantic modelling that doesn't exist, certainly not to the extent of say an XML or a GraphQL parser library.

> Personally, I would suggest reversing the whole problem formulation. You don't need to take your existing system and make it "speak SQL." Instead, it's much simpler (and, in the end, more performant) to take an existing, robust RDBMS and hollow it out into this "mini-version" that queries through to your service.

Which again, was complexity that I mentioned trying to avoid ("outside of an implementation tied to a specific SQL vendor"). You might not think it a big deal to tie your application to a single database vendor, but that's not always a good option. GraphQL as a language/platform is largely agnostic, supports a large variety of options for backend storage and makes it very easy to support heterogeneous data sources, because it is almost always as simple as "write a function to get the data" (and maybe "write a function to guess how expensive it would be to get the data", if you are optionally using a basic quota system). It does sound like Postgres Foreign Data Wrappers make some of that possible in that very specific database environment, but certainly it's a lot more complex of an option, which gets back to my root problem statement that "complexity" is most probably the reason SQL isn't getting used as much traction as an API option as for instance GraphQL.

derefr · on Oct 5, 2018

> "complexity" is most probably the reason SQL isn't getting used as much traction as an API option as for instance GraphQL.

There's a much more obvious reason: there's no such thing as an "ANSI SQL wire protocol." If you want to take advantage of an existing ecosystem of client libraries that speak SQL to your server (and why wouldn't you?), you literally have to choose a specific vendor's RDBMS wire protocol to mimic, and then tell your clients to configure their client library as if your service was an instance of that RDBMS.

Amazon's Aurora? You need to speak the MySQL wire protocol.

CockroachDB? It speaks the Postgres wire protocol.

If there was such a thing as an "ANSI SQL wire protocol"—and especially if there was such a thing as a "packed binary ANSI-SQL wire protocol"—then I think you'd see adoption of SQL as a lingua franca for things that are not, themselves, SQL RDBMSes take off. As it is, SQL can't be a lingua franca of anything–because there is no "SQL standard" in the sense of an IETF RFC networking standard you can just build your server around. "ANSI SQL" is a language standard (specifically, a mapping from a grammar to a semantics); not a protocol standard.

There isn't even an SQL-wire-protocol standardization committee. Not even a working group. Not even a random FOSS project to attempt to create a "reference ANSI-SQL protocol specification." It's just... not there. (It's kind of bizarre, actually.)

Volt · on Oct 5, 2018

But ODBC exists. Or am I missing something?

WorldMaker · on Oct 5, 2018

Would you use ODBC to communicate to a web API?

Also, ODBC is not quite the abstraction layer I think parent was referring to, because ODBC only (barely) abstracts getting SQL from a client to a server, it doesn't abstract any of the semantics of the communication because it entirely punts that to a "driver" model. ODBC is slightly better than RS-232, but it's about the same abstraction layer from what I've seen.