I must admit, I don't like the the use of floats/decimals/rationals in this way....

gnosek · on March 21, 2018

Two ways to do this with window functions (there's probably a better one but this is what I came up with quickly):

1. Subquery

    SELECT * FROM (
        SELECT u.user, i.text, row_number() OVER (PARTITION BY i.user ORDER BY i.pos) AS row
        FROM users u INNER JOIN items i USING(user)
    ) numbered
    WHERE row <= 3;

2. CTE

    WITH numbered AS (
      SELECT u.user, i.text, row_number() OVER (PARTITION BY i.user ORDER BY i.pos) AS row
      FROM users u INNER JOIN items i USING(user)
    )
    SELECT * FROM numbered WHERE row <= 3;

Same thing, really. CTE looks a bit clearer, subquery seems to generate a faster plan but I don't really have a large enough dataset handy for it to make a difference.

jashmatthews · on March 21, 2018

They are fine for sorting! You just need to think about things differently.

  SELECT user, text FROM (
    SELECT u.user, text, row_number() OVER (PARTITION BY user ORDER BY rank) as rank
    FROM users u
    LEFT JOIN items using(user)
  ) ranked_items
  WHERE rank < 4

This option works for unlimited top-n without creating a bunch of unnecessary columns in the response.

trinitry3 · on March 21, 2018

> Say I wanted to present a list of users with their top three to-do items, that's impossible using the above method.

Not really:

    SELECT u.user, i.text
    FROM users u,
    LATERAL (SELECT * FROM items WHERE items.user = u.user ORDER BY items.pos DESC LIMIT 3) i;

(You can also make these columns instead of rows, but as someone already pointed out, you'd usually not do that in SQL.)

Svip · on March 21, 2018

You are not wrong, although your example would not work in every SQL dialect (even when rewritten). Sybase's T-SQL does not have equivalent for LIMIT.[0]

My point being, there are some issues with regards to data purity and reasoning, particularly if you want a system where users can build custom reports.

And I just wish the original article highlighted that using floats would have these issues, and if writing SQL like this is important to you, then you might want to reconsider.

(As for your note in brackets, where I work, SQL is basically used as a scripting language. So we would do my example in SQL.)

[0] Yes, I know there is 'set rowcount', but you cannot do that within a sub select.

smnc · on March 21, 2018

> Sybase's T-SQL does not have equivalent for LIMIT.[0]

Are you looking for SELECT TOP 3 ...?

Svip · on March 21, 2018

First available from 12.5.3. And I cannot be certain we have that version or a newer running.

But you are right, I should have clarified that.

grzm · on March 21, 2018

> "I don't like the the use of floats/decimals/rationals in this way. They are unfit for sorting."

Would you elaborate on why you consider these types unfit for sorting? Their orderings are deterministic, and efficiency depends on the implementation of the indexes for those specific types, not inherent in the data types themselves.

> "select u.user, item1=i1.text, item2=i2.text, item3=i3.text"

It would be more natural in SQL to return three rows than return three columns.

    SELECT u.user, i.text
      FROM users AS u
      JOIN items AS i USING (user)
      WHERE u.user = ?
      ORDER BY i.pos ASC LIMIT 3;

Why would you prefer to do it the way you describe?

Svip · on March 21, 2018

> Would you elaborate on why you consider these types unfit for sorting?

I should have changed that language, what I meant was that they were unfit for reasoning about your data from an isolated point of view, i.e. standing with just one row. To understand an item's position using floats, you must know the entire context.

> It would be more natural in SQL to return three rows than return three columns.

Look at my select again, yours have a specific user in mind, mine doesn't. I want a list of all users who have at least one to-do item. And I want to see their top three to-do items in the report.

Without doing sub-selects in the joins, there is no way to limit this to three when the position is stored as float. And even your sub-selects are going to be complicated.

It might be better to create a temporary table first with each user and their first three items, but that's hardly efficient.

> Why would you prefer to do it the way you describe?

Because I need an overview. And I don't want up to three rows per user, particularly because some users will only get one row, some only two and most three rows. That creates an inconsistent overview.

(And that's assuming you write some SQL to properly limit the joins to three rows.)

grzm · on March 21, 2018

Gotcha. Thanks for clarifying. You're right that you're effectively encoding the position, and that's part of the trade off. Edit to add: As for knowing the position by looking at it, I'd argue that most use cases, one wants to know the positions relative to others, which means you want to know them in context, next to the other rows. If you want to know the absolute numbers, you can use rank or row_number to add it: perhaps even provide a view if this is something you do frequently.

As for the report query you describe, you can get at the data you want with windowing functions. There are strategies for performing the pivot as well, but I'd likely do that in application code rather than in the SQL query.

cthor · on March 21, 2018

SELECT * FROM list ORDER BY pos LIMIT 3

If there's an index on pos, there's no cost to the sort.

Svip · on March 21, 2018

Sorry, but that's just one user. Assume we have 1000 users, 800 of which have at least one item in the `items` table.

Now I want a full select of those 800 users with their first three to-do items in neat columns next to the user.

Your select doesn't do that.

cthor · on March 21, 2018

Ah, ok, I misread. Yeah, that requires a subquery then, but not because of floats being unsuitabe for sorting, but because you're doing a 1-to-many thingy in a 1-to-1 way by hardcoding 3 columns. It's like writing

   items[0].text + items[1].text + items[2].text

In code to simplify it, instead of

   items.slice(0,2).map((e) => e.text).concat('');

The former mostly works, but it's definitely not elegant. And you have to be sure the number of items being 3 is set in stone.