The relational model is awesome. SQL is one of the worst languages I have ever encountered. The syntax is awkward and unwieldy. The only unit of abstraction in SQL the language is the VIEW. I end up with so many repeated patterns in SQL it's mind numbing.
I would love to see lisp-style (AST) macros applied to the relational model. Here's an example that might have an obvious solution (please share), but not off the top of my head. Find the id of every row having the minimum value per group by aggregation. I have to write a subquery to do this, but the transformation is quite regular and predictable. Where's my macro?
I believe some of the window functions[0] (specifically, row_number) will do what you want. Honestly though, these might not be a readability improvement over subqueries.
Edit: Just tested it, it works but I wouldn't call it pretty.
create table testing(id int, grp int, other varchar2(255));
insert into testing values (1, 1, '');
insert into testing values (2, 1, '');
insert into testing values (3, 1, '');
insert into testing values (9, 2, '');
insert into testing values (10, 2, '');
with cte as (
select grp, id, row_number() over (partition by grp order by id) as rn from testing
)
select grp, id from cte where rn = 1;
Neat! It doesn't work on Oracle, but you implied that it's PostgreSQL-specific so I somewhat expected that. It's a lot cleaner than mine and, even though it's not something I've seen before, it's fairly obvious what it's going to do.
When I did my testing I did it on Oracle since I have it available at work. For my non-work projects I use PostgreSQL and, for the most part, find that I can flip between the two database without too much trouble. Of course, this also means that I don't know about some PostgreSQL niceties like what you showed.
Very true, it's a nice PG specific extension of the SQL standard. Would be nice if some of these types of extensions made it back into the standard to increase the likelyhood other DB's you use would also support it.
It's syntactic sugar, but it is a whole lot more readable in some cases than a window function. It's nice to not need a sub query or cte to express it.
I think what paddy_m complains about is SQL’s tendency to require copy-paste programming.
For example, a phrase like row-number() over (partition by grp order by id) might return in many, many of your stored procedures, but there’s no way to only write it once in SQL.
The same thing effectively applies to that common table expression. There, you can create a view, and just write
select grp, id from grouped_tests where rn = 1;
, but many SQL query planners won’t optimize queries that well across views, so you end up copy-pasting the CTE, or, worse, not even doing that because you have to write an even uglier query to get decent performance.
The only challenge here is that you can't bring out other fields from the table.
Building on my previous example:
insert into testing values (8, 2, 'stuff');
select distinct grp, first_value(id) over (partition by grp order by id) as rn, other from testing;
This returns two rows for grp = 2, whereas adding the 'other' column to my first attempt (both in the CTE and the main query) works as expected and only returns one row for each grp.
This would work properly within the CTE, but then you're joining to the CTE instead of just querying it. There are many ways to solve this problem.
My ideal solution would be something like this that gets rid of the CTE:
select grp, id, row_number() over (partition by grp order by id) as rn, other from testing where rn = 1
But that doesn't work. Oracle only allows window functions in the SELECT clause, not in the WHERE clause, plus you can't reference the column alias anyway.
Bear in mind that CTEs act as an optimization fence, so the filter (rn=1) wont be pushed to the CTE. The CTE will be materialized and only then the filter will be applied. I believe pg12 will change this behaviour.
That is only in postgres. The optimization fence is not part of the standard, was only created in postgres for simplification of implementation, and is likely to be removed soon (version 12).
Hey, saurik, do you know there are plenty of people waiting for you to upgrade Substrate for A12?
We know you are maybe too busy to upgrade substrate but can you tell people when that thing will come? We all are looking forward to seeing this happens. So can you reply me?
Thanks for the responses, looks like I have some digging to do. My point remains though that I don't know of a way to write user extensible syntax in SQL. I can get by fine in most languages without user extensible syntax (especially python), but the lacking feature really hurts in SQL.
This isn't ideal but SQL got a lot less mind-numbing for me when I got decently good at vim. Maybe SQL doesn't have macros but vim macros, dot, and column mode reduced my repetitive typing by a lot. Starting with a tool-generated list of column names often helped too.
I would love to see lisp-style (AST) macros applied to the relational model. Here's an example that might have an obvious solution (please share), but not off the top of my head. Find the id of every row having the minimum value per group by aggregation. I have to write a subquery to do this, but the transformation is quite regular and predictable. Where's my macro?