I'm not sure if it's as much a bottleneck in Postgres as it is in MySQL, but I've just written a tome[1] on more effective offset/limit pagination by using something called a deferred join.

The main problem with offset/limit pagination is that the database is forced to fetch and discard more and more rows as you get deeper into the pages.

The trick to getting offset/limit to be faster is to reduce the amount of data the database has to inspect.

Using a "deferred join" defers the expensive data access until after the pagination has been calculated.

So instead of

  select * from contacts order by updated_at desc limit 15 offset 10000

you'd do

  select * from contacts inner join (
    select id from contacts order by updated_at desc limit 15 offset 10000
  ) as tmp using(id)
  order by updated_at desc

That way the inner query can (in some cases) use a covering index to pull everything it needs, and the expensive wide-row fetching only happens for the handful of rows that survive the limit/offset.
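
For what it's worth, here's a sketch of the kind of index that makes the inner query covering; the index names and column choices are my assumptions, not from the article:

  -- MySQL/InnoDB: secondary indexes implicitly store the primary key,
  -- so indexing updated_at alone covers `select id ... order by
  -- updated_at desc` (the desc order comes from a backward index scan).
  create index contacts_updated_at_idx on contacts (updated_at);

  -- Postgres: put id in the index too so the planner can choose an
  -- index-only scan (assuming the visibility map is reasonably fresh).
  create index contacts_updated_at_id_idx on contacts (updated_at, id);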

This works especially well when you intend to select a lot of data and when you're filtering or sorting on more than just the primary key.
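
To sketch that case (hedged: `is_archived` is a hypothetical column, not something from the article), the filter and sort run against the narrow inner query, and the wide `select *` only touches the 15 surviving rows:

  select * from contacts inner join (
    select id from contacts
    where is_archived = 0
    order by updated_at desc
    limit 15 offset 10000
  ) as tmp using(id)
  order by updated_at desc

A composite index like (is_archived, updated_at) would keep that inner query index-only.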

[1] https://aaronfrancis.com/2022/efficient-pagination-using-def...

I'm guessing this is due to MySQL using clustered indexes, which require a separate B-tree lookup into the primary key for each row matched in the secondary index, since all the secondary index stores is the key. A Postgres secondary index instead stores the row's physical location: the heap page and its slot within that page.
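
If you want to check which behavior you're getting, comparing plans for the inner query works on both engines (a sketch; output shape varies by schema and version):

  -- Postgres: look for "Index Only Scan" and the buffer counts.
  explain (analyze, buffers)
  select id from contacts order by updated_at desc limit 15 offset 10000;

  -- MySQL 8.0+: shows how many rows are read before offset discards them.
  explain analyze
  select id from contacts order by updated_at desc limit 15 offset 10000;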
