Hacker News new | past | comments | ask | show | jobs | submit login

The relational model is awesome. SQL is one of the worst languages I have ever encountered. The syntax is awkward and unwieldy. The only unit of abstraction in SQL the language is the VIEW. I end up with so many repeated patterns in SQL it's mind numbing.

I would love to see lisp-style (AST) macros applied to the relational model. Here's an example that might have an obvious solution (please share), but not off the top of my head. Find the id of every row having the minimum value per group by aggregation. I have to write a subquery to do this, but the transformation is quite regular and predictable. Where's my macro?




I believe some of the window functions[0] (specifically, row_number) will do what you want. Honestly though, these might not be a readability improvement over subqueries.

[0] https://www.postgresql.org/docs/11/functions-window.html

Edit: Just tested it, it works but I wouldn't call it pretty.

    create table testing(id int, grp int, other varchar2(255));
    
    insert into testing values (1, 1, '');
    insert into testing values (2, 1, '');
    insert into testing values (3, 1, '');
    insert into testing values (9, 2, '');
    insert into testing values (10, 2, '');
    
    
    with cte as (
      select grp, id, row_number() over (partition by grp order by id) as rn from testing
    )
    select grp, id from cte where rn = 1;


Another solution for PG which i've used many times:

    select distinct on (grp) grp, id
    from testing
    order by grp, id;


Neat! It doesn't work on Oracle, but you implied that it's PostgreSQL-specific so I somewhat expected that. It's a lot cleaner than mine and, even though it's not something I've seen before, it's fairly obvious what it's going to do.

When I did my testing I did it on Oracle since I have it available at work. For my non-work projects I use PostgreSQL and, for the most part, find that I can flip between the two database without too much trouble. Of course, this also means that I don't know about some PostgreSQL niceties like what you showed.


Very true, it's a nice PG specific extension of the SQL standard. Would be nice if some of these types of extensions made it back into the standard to increase the likelyhood other DB's you use would also support it.

It's syntactic sugar, but it is a whole lot more readable in some cases than a window function. It's nice to not need a sub query or cte to express it.


Holy shit. Thank you!


I think what paddy_m complains about is SQL’s tendency to require copy-paste programming.

For example, a phrase like row-number() over (partition by grp order by id) might return in many, many of your stored procedures, but there’s no way to only write it once in SQL.

The same thing effectively applies to that common table expression. There, you can create a view, and just write

  select grp, id from grouped_tests where rn = 1;
, but many SQL query planners won’t optimize queries that well across views, so you end up copy-pasting the CTE, or, worse, not even doing that because you have to write an even uglier query to get decent performance.


You can also express this example using the first_value window function:

   select distinct grp, first_value(id) over (partition by grp order by id) as rn from testing;


The only challenge here is that you can't bring out other fields from the table.

Building on my previous example:

    insert into testing values (8, 2, 'stuff');   
    select distinct grp, first_value(id) over (partition by grp order by id) as rn, other from testing;
This returns two rows for grp = 2, whereas adding the 'other' column to my first attempt (both in the CTE and the main query) works as expected and only returns one row for each grp.

This would work properly within the CTE, but then you're joining to the CTE instead of just querying it. There are many ways to solve this problem.

My ideal solution would be something like this that gets rid of the CTE:

    select grp, id, row_number() over (partition by grp order by id) as rn, other from testing where rn = 1
But that doesn't work. Oracle only allows window functions in the SELECT clause, not in the WHERE clause, plus you can't reference the column alias anyway.


Bear in mind that CTEs act as an optimization fence, so the filter (rn=1) wont be pushed to the CTE. The CTE will be materialized and only then the filter will be applied. I believe pg12 will change this behaviour.


That is only in postgres. The optimization fence is not part of the standard, was only created in postgres for simplification of implementation, and is likely to be removed soon (version 12).

https://www.depesz.com/2019/02/19/waiting-for-postgresql-12-...


I don't have time today to try to build it, but that challenge seems like a textbook usage of window partition.

https://www.postgresql.org/docs/9.1/tutorial-window.html


Hey, saurik, do you know there are plenty of people waiting for you to upgrade Substrate for A12? We know you are maybe too busy to upgrade substrate but can you tell people when that thing will come? We all are looking forward to seeing this happens. So can you reply me?


Thanks for the responses, looks like I have some digging to do. My point remains though that I don't know of a way to write user extensible syntax in SQL. I can get by fine in most languages without user extensible syntax (especially python), but the lacking feature really hurts in SQL.


This isn't ideal but SQL got a lot less mind-numbing for me when I got decently good at vim. Maybe SQL doesn't have macros but vim macros, dot, and column mode reduced my repetitive typing by a lot. Starting with a tool-generated list of column names often helped too.


The downside of that is that you end up with heaps and heaps of buggy queries and SQL code resulting in a maintenance nightmare.


Are you saying that about SQL in general? I end up with the same queries as if I'd typed out the SQL any other way.

Or are you saying that efficient editing makes people forget about writing clean, concise code? I don't find that to be the case.


There's other abstraction ideas like synonyms and functions in most engines.

select top(1) with ties

id

from

table

order by

row_number() over (order by value asc)

In a meeting but is that the gist?


Depending on the version, you might be served by using MIN_BY(x,y) or MAX_BY(x,y). Returns the min(max) value of x for group y.


Not sure I understand properly, but maybe this:

> select groupbycol, array_agg(id) from table group by groupbycol having avg(minavgcol) > T




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: