How Postgres chooses which index to use for a query (pganalyze.com)
300 points by tosh on April 21, 2022 | hide | past | favorite | 76 comments



I love this kind of breakdown that not only explains how something works but walks you through the actual mechanics at the code level.

Apart from the explanation itself, it shows how nice the PostgreSQL code is. It's a joy to read through.

And this topic specifically is great to understand better, because I have multiple times encountered queries that chose insane strategies resulting in ridiculously long query times. Unfortunately the developers of PostgreSQL really seem to have a "you're holding it wrong" approach to this issue ... yes, I know I'm doing something dumb if my query plan is doing that, but it would be really nice if Postgres gave better tools to figure out why, so that I can do better.

The explain plan tells you what plan it created, but it doesn't help you understand why it didn't pick the one you think it should. This is where hints (which PostgreSQL doesn't support) are so powerful: if I can force it to use the plan I am thinking of and then see what costs it estimated for each operation, I can figure out why it is rejecting it. I would love to know if there are any good tools for this!


The fun thing about databases is that they can sometimes spend more time figuring out what to do than actually executing the query. It's usually worth it - because doing the wrong thing is pretty expensive.

But it's nice to be able to use index hints when you understand and accept the danger. I've seen queries that take 30-40ms drop to < 1ms just by telling the DB which index to use, on multi-terabyte data sets with a few complicated indexes. The query was always fast and picked the same index, but by just telling it which index to use you get an order of magnitude better performance.


It was actually very surprising to me to find PG does not cache query plans but instead replans on every execution! I am used to DBs like MS SQL Server that cache and reuse plans, making things like prepared statements redundant.

My understanding is that the planner is so simple that caching is deemed unnecessary; unless of course it isn't, in which case you do a prepared statement, which only works on one connection...

Not caching plans would absolutely trash performance on SQL Server, as its planner spends significant time optimizing.


> It was actually very surprising to me to find PG does not cache query plans but instead replans on every execution!

If you use prepared statements, `plan_cache_mode` (since Postgres >= 12) can be configured.

It defaults to `auto`, which will generate a generic plan after 5 (iirc) executions.

If set to `force_generic_plan`, it'll make a generic plan right away. If set to `force_custom_plan`, it'll always do planning. The latter can be useful for queries where the parameters of the prepared statement can have very different selectivities.
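A minimal psql sketch of these settings; the table and prepared statement here are invented for illustration:

```sql
-- Hypothetical table: users(id int primary key, name text)
PREPARE get_user (int) AS
  SELECT * FROM users WHERE id = $1;

-- Default behavior: custom plans at first, then a cached generic
-- plan once it looks no worse than the custom ones.
SET plan_cache_mode = auto;

-- Always replan with the actual parameter values:
SET plan_cache_mode = force_custom_plan;
EXPLAIN EXECUTE get_user(42);  -- plan is built for id = 42

-- Build and reuse one generic plan immediately:
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE get_user(42);  -- plan shows the $1 placeholder
```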


Prepared statements exist only on a single connection (session); it's actually surprising that PG doesn't even always cache plans for prepared statements!

In MS SQL Server, plans are always cached unless told not to, and this is across connections: if it sees the same SQL statement it will skip planning. This goes way back. If you go back far enough it only did that for stored procedures, so there was a push to do everything through sprocs for performance; not so much in the last 20 years. Using prepared statements is mostly for backward compatibility, as they have almost no performance advantage due to modern plan caching.

I think PG finally has sprocs, but still no plan caching even for them, unless something's changed.

PG has no way to share plans between worker processes AFAIK. I believe the lack of plan caching is noticed more with heavy JIT optimization, where PG now uses LLVM, which has a relatively high planning cost.
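The cross-connection plan cache described above is conceptually just a map keyed by the statement text. A toy Python sketch of the idea (this is not how SQL Server is actually implemented, just the shape of the mechanism):

```python
class PlanCache:
    """Toy shared plan cache: plan once per distinct SQL text."""

    def __init__(self):
        self._plans = {}
        self.plan_count = 0  # how many times the "planner" actually ran

    def _plan(self, sql):
        # Stand-in for an expensive optimizer pass.
        self.plan_count += 1
        return ("plan", sql)

    def get_plan(self, sql):
        # Any connection issuing the same text reuses the cached plan.
        if sql not in self._plans:
            self._plans[sql] = self._plan(sql)
        return self._plans[sql]


cache = PlanCache()
for _ in range(1000):
    cache.get_plan("SELECT * FROM users WHERE id = ?")
print(cache.plan_count)  # -> 1: planned once despite 1000 executions
```

The real thing also has to handle invalidation (schema and statistics changes), memory limits, and parameter sniffing, which is where the foot-guns discussed elsewhere in the thread come from.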


One could argue that always caching plans could be costly resource-wise, and you'd need to administer timeouts and cache sizes, while when the cache is associated with a session you get the cleanup for free.


How does it work when the parameter value changes the plan? Simple example: `where name like ?`. If the parameter is 'abc%', then you can use an index. If the parameter is '%abc', you can't.


It uses the index either way; specifically, it still does index seeks, but ends up doing the same work as an index/table scan, only slower. It normally makes a plan based on the first execution using parameter sniffing, so it might plan without an index seek if the first request has a wildcard in front.

You can pass it the hint OPTION (RECOMPILE), which will make it act like PG and build a new plan every time. For a query with wildly varying parameters that doesn't execute often this can make sense; most of the time it doesn't, due to planning overhead.

There is also the OPTIMIZE FOR hint that can have it make a plan optimized for the more common of the parameter values if you don't want it planning every execution regardless of value passed on first execute.
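The two hints mentioned here look like this in T-SQL; the table, column, and parameter are invented, and assume a stored procedure or parameterized batch:

```sql
-- Throw away the cached plan and replan with the sniffed value
-- on every execution:
SELECT * FROM Orders WHERE CustomerName LIKE @pattern
OPTION (RECOMPILE);

-- Keep one cached plan, but build it for a representative value
-- instead of whatever happened to be sniffed first:
SELECT * FROM Orders WHERE CustomerName LIKE @pattern
OPTION (OPTIMIZE FOR (@pattern = 'abc%'));
```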


You can add 'OPTION (RECOMPILE)' to a statement to force the query plan to be rebuilt on every execution. A similar thing can be done with stored procedures, which are also aggressively plan-cached.


The MSSQL plan cache is something you need to be aware of; it will work against you if the statistics change but the plan remains the same. This is especially true if you start with a small amount of data, get a plan cached, and then bulk load lots after testing. Foot = shot.


Normally AUTO_UPDATE_STATISTICS will take care of recompiling plans if statistics change significantly.

I usually have the opposite problem: a good plan changing to a bad one due to statistics changing at some random time in production, which seems to be a common problem in PG as well and why everyone wants hints.

I prefer hinting for critical problematic queries that tend to have unstable plans due to statistics changes, but you can also lock a plan in SQL server to force a specific plan, another useful feature to prevent production db instability.


Agree completely on the plan changing from good to bad after some kind of analyze going awry. Had this happen multiple times on MSSQL and Postgres. Our workaround was to run analyze periodically, multiple times a day, since we could afford it.


I did not know PG lacked a plan cache. I guess if the planner is fast enough trying to build cache keys from a query will just slow things down.


If the planner is faster than a cache lookup it makes me wonder how sophisticated it is and how much optimization is left on the table.

In most apps I deal with, queries are very regular, and the plan cache makes them act like prepared statements, improving performance 99% of the time. Once in a while it works against you and you need to give it a hint to recompile or to optimize for specific parameter values.


Also, for those curious, here's the C++ query planner for Mongo: https://github.com/mongodb/mongo/blob/master/src/mongo/db/qu...

Checking whether the key pattern matches an index: https://github.com/mongodb/mongo/blob/master/src/mongo/db/qu...

I wish Mongo could use intersecting indexes for queries and sorting...


Yes. Especially when joining dozens of tables in a single query, planning seems to be exponential in the number of tables involved, and can start to take over 1 second.


I primarily work with InnoDB in MySQL and I've seen it occasionally make some terrible query plans where I've had to specify things to make it perform well.


especially painful if the wrong thing gets JITted with full optimization... only to never be executed


Postgres really really needs index hints.


Probably. There are old legends about why they're not in, when it's really more a laundry list as to how they are perhaps somewhat less useful than people think. My personal opinion is that missing from the equation is that it's an invasive feature with a wide interface, and nobody in a position to do the work and credibly maintain it for a long period (this is a tiny number of people) has found it the optimal use of their time. You can contrast this with, say, the JIT feature, where it's a ton of technical work, but the interface is relatively thin.

I feel like the evolution regarding hot standby was somewhat similar: people had objections, people pointed out weaknesses in this family of systems, people pointed at Slony or Skytools, but when someone cleared their schedule and looked like they'd account for bugs for a good while, it wasn't such a big fight in the end.


I would go further and say that there should be an alternate language that lets you just provide a planned path.

Just yesterday I spent time debugging a whole query because heuristics made the planner plan the wrong thing.

The planner is pretty amazing black magic that can just make life easy (just like compilers do a bunch of cool stuff).

But I think SQL perf would be a lot easier for people to grok if the low level involved specifying “hey, system, go through this btree, pull these items, merge with items from this other btree, etc”


Agreed. A while ago I looked into hacking something like this into plv8. You'd be able to do something like:

SELECT * FROM js_query('...');

Where the query is a JS snippet that has access to low level access methods, e.g.:

  var results = [];
  var cursor = myTable.indexes.foo.seek('item-23');
  while (cursor.next()) {
    results.push(cursor.row());
  }
  return results;

However, it turns out this would be a lot of work and it's difficult to make it work with features like row level security.

Since I didn't have enough time I dropped the idea, but it'd be awesome if somebody hacked this up at some point : ).


> "hey, system, go through this btree, pull these items, merge with items from this other btree, etc”

What I've always wanted in Postgres, in one quote, except I'd also allow other index types (e.g. Hash/BRIN/etc.) in its language.

Side benefit: it also lets devs understand the pros/cons of these constructs when using each one, and build up their knowledge of what are really just algorithms and data structures.


Do any relational DBs let you lock statistics or the query planner directly? Something like how ML pipelines are split between training and inference phases. Let me train the optimizer on a representative load (prod replay or synthetic) and then lock the query planner for my production traffic. Still take advantage of the black magic but control it so query plans don't unexpectedly change.


The problem with databases is that they tend to grow over time, so the statistics change all the time, and thus so does the optimal query plan.


Yes, some are able to verify that an execution plan already exists (saved in a gather run or by explicit command) before replanning. You can also group them with one label and activate them only when needed.

I'd guess there must be a good number of people who would use it in PostgreSQL. Has anybody analyzed previous attempts to implement it?


Sybase 11 and 12 let you force a plan and declare which indexes to use, including temp tables, if they were declared in a stored procedure that called the stored procedure where you used the indexes. The optimizer was not the best.


MSSQL mostly has this feature with Query Store.


That would be a huge compatibility nightmare any time you upgraded database versions.


Not really. The basic concepts of "tables/indexes" and "access paths" have been stable for years (decades really). So if you wrote something like MY_TABLE.MY_INDEX.SEEK(whatever), it would have the same meaning as it did 30-40 years ago. Same for range scans, nested loop / merge / hash joins, sorting, etc.


I agree, and viraptor came to my rescue recently:

https://news.ycombinator.com/item?id=31067059#31068194

> There's also an extension if you want just the hints: https://pghintplan.osdn.jp/pg_hint_plan.html
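For reference, pg_hint_plan reads hints from a special comment at the start of the query; the tables and index below are invented:

```sql
-- Force an index scan on a specific (hypothetical) index:
/*+ IndexScan(orders orders_customer_idx) */
SELECT * FROM orders WHERE customer_id = 42;

-- Force the join order and join method:
/*+ Leading(customers orders) NestLoop(customers orders) */
SELECT *
FROM customers
JOIN orders ON orders.customer_id = customers.id;
```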


there is a maintained index hint extension: https://github.com/ossc-db/pg_hint_plan - at least as far as 13 (and likely 14).

if we're going to talk about index functionality that would be good and effective for Postgres, an index across all partitioned tables (both normal and unique) would be very much welcomed.

the problem is finding someone to maintain it for life.


I always assumed that an ideal query planner would be in the best position to decide which plan is best if it had real-time access to up-to-date data statistics and available indexes.

Adding hints might hinder performance improvements when migrating to a newer version of a database, as the newer version might have a better way to execute your query than what the hint is suggesting. You'd have to revisit all your hints when upgrading.


Is there a good reason they haven't added them? Difficult to implement?


I know nothing of why Postgres doesn't want to implement them, but coming from the MS SQL Server side of things I can tell you they are quite an effective foot-gun in a number of circumstances. More often, in fact, than they are a benefit.

Often people reach for index hints when they should be fixing issues in the query or the available indexes, swapping a scan for many seeks, which on small data does give a benefit, but once the code goes into production and the data grows, performance falls through the floor. Also, if an index is named in the hint, you open up a new family of errors if the index gets altered later (though you could perhaps implement the hint as "use an index on columns x & y" rather than "use this specific named index" to get around this).

There are circumstances where they are a genuinely useful tool, of course, but not nearly as many as people seem to think.

Beyond that I assume implementation will have difficult points. If the hints are taken as instructions it may require significant changes to the rest of the generated query plan, if they are sometimes ignored then people will complain bitterly…


> Poor application code maintainability: hints in queries require massive refactoring.

> Interference with upgrades: today's helpful hints become anti-performance after an upgrade.

> Encouraging bad DBA habits: slap a hint on instead of figuring out the real issue.

> Does not scale with data size: the hint that's right when a table is small is likely to be wrong when it gets larger.

> Failure to actually improve query performance: most of the time, the optimizer is actually right.

> Interfering with improving the query planner: people who use hints seldom report the query problem to the project.

https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion


There is no optimizer that is capable of getting the correct result 100% of the time. Even Oracle with decades of costly engineering behind it requires query tuning (either via explicit hints on the client side sql or via server side plan tuning) at any reasonably large scale.

I like PostgreSQL and use it when I can, and I respect the PostgreSQL core dev team's adherence to principles, but in this particular case I believe they are just inventing arguments to justify their unreasonable stance against allowing query hinting.


So "most of the time" means "100% of the time" where you live?


I mean, I think this is just one of those things where it only matters rarely, but when it does it matters a lot.


Maybe postgres isn't the right solution in those cases. If you need tight control over how your queries are executing you may not want an rdbms.


I don't think choices of things like this ever really come down to one thing like that. Not using an RDBMS might mean having to build a whole bunch of features you wouldn't otherwise have to.


That's possible I guess, but at least some other relational dbs do allow that control, so it's not like it's impossible.


The biggest reason - which doesn't seem to be mentioned - is consistency and stability.

I don't like being woken up to debug why the query planner has decided to do something new. Yes, there's always a "good reason" but does it need to communicate that by taking down production at 3am?

Even if it's a better plan, I don't need my anomaly detection alerts going off because something is now running in half the time...


This is why I prefer to page on features working within SLA instead of lower level things like CPU usage or query time.

Having a system send warnings about CPU or query time are great when I am awake. They are hints to a problem, but not always a problem themselves.


> Failure to actually improve query performance: most of the time, the optimizer is actually right.

my favorite response to someone complaining about a query doing a sequential scan is to tell them to `set enable_seqscan = false;` and rerun the query. usually, Postgres is right. it's rare that it is wrong, but it's nice to have a workaround for those few times that it's wrong - at least on fairly static data.
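Concretely, the trick looks like this (the table is invented). Note that disabling a node type doesn't forbid it; it just assigns it an enormous cost, so the planner only falls back to it when there is no alternative:

```sql
EXPLAIN ANALYZE
SELECT * FROM events WHERE created_at > now() - interval '1 day';

-- Make sequential scans look prohibitively expensive, for this
-- session only:
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM events WHERE created_at > now() - interval '1 day';

RESET enable_seqscan;  -- don't leave this set outside of debugging
```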


A common flaw to “just do x and rerun it” is that you are not running both in the same conditions: the data is in RAM second time around. I've seen people think that a change (an index hint, switching between a CTE and a sub-query, …) has made a massive difference because the last couple of times the query ran it timed out or took ages, when in fact the next run works and works faster not because of the change but because the previous run or two have primed the page cache with all the data that is needed, cutting out all the IO.

Test from both warm and cold states. Warm because most queries you are optimising are those that run often so usually do have some or all of the required data in RAM, and cold because sometimes they will run cold in production (and some queries just touch so much data, and/or cause spooling to disk of intermediate steps, so that IO will always be involved).

Ignore anyone who says always test warm because that is the most common state by far: if your query is reading far more than it really needs to when IO is involved, it will be using much more CPU than it needs to at other times and will harm concurrency potential (directly by eating CPU when other things could use it instead, or through holding more read locks than needed and holding them longer than necessary).
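One way to tell whether a given run was warm or cold instead of guessing: in Postgres, EXPLAIN with the BUFFERS option reports whether pages came from the buffer pool ("shared hit") or had to be read ("shared read"). The query here is invented:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events
WHERE created_at > now() - interval '1 day';
-- Look for a line like:
--   Buffers: shared hit=1234 read=5678
-- "read" pages were not in shared_buffers (though they may still
-- have been in the OS page cache, so this is only half the story).
```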


The planner is usually correct given the information it has; it's the table statistics that are usually off and causing problems.


You can't check complex cases with that.


complex cases aren't usually thrown over a wall with "why is this doing a sequential scan?" typically, in the cases where someone is asking me that specific question, it's a very simple query.

complex cases, otoh, usually aren't approached with that much naivety.


> people who use hints seldom report the query problem to the project

This statement is shockingly user-hostile and paints the users overall as selfish and uninterested in the improvement of the postgresql project. Certainly some users will use their hint and go home, but you can't know that this would result in a net decrease in performance reports. Maybe with hints different users would be empowered to submit more reports like "The query planner creates a slow plan for query X, I know because when I use hint Z it's faster". Not implementing hints just to spite selfish users while ignoring the potential additional reports submitted by different caring users is shortsighted and antisocial.

Maybe there are a bunch of good reasons why hints should not be implemented, but this is not one of them.

My past commentary: https://news.ycombinator.com/item?id=29987139


It seems to be a matter of principle with (some of?) the core devs that it adds complexity and shouldn't be necessary. I remember the big debate long ago that resulted in Postgres rejecting the standard SQL MERGE command in favor of a proprietary syntax for upserting, despite the common practice of using MERGE w/hints for doing upserts in most other engines. IIRC it boiled down to: Postgres devs do not want hints, ever.


> a proprietary syntax for upserting despite the common practice of using merge w/hints for doing upserts in most other engines

I was the primary author of the feature. That decision wasn't related to hints at all. MERGE just doesn't address the problem that I wanted to address.

I think that you must be referring to Oracle's ignore_row_on_dupkey_index. Is that really in widespread use? While it is technically a hint, it is unlike any other hint in at least one important way: it changes the semantics of the query!


> ignore_row_on_dupkey_index

I was thinking of the hints you used to have to use with a MERGE statement in order to get an atomic upsert: HOLDLOCK in SQL Server and LOCK TABLE in Oracle. My recollection of the terribly old newsgroup thread is that it was said there (presumably by you) that MERGE doesn't perform atomic upserts without these hints, and since people primarily wanted MERGE to perform atomic upserts and Postgres doesn't support hints, it made more sense to just make something for upserts, hence the ON CONFLICT UPDATE stuff. Do I have it wrong?


> use with a merge statement in order to get an atomic upsert - holdlock in sql server

Because of all the potential edge cases that MERGE didn't handle properly when introduced to SQL Server (2008?), I've avoided its use. Despite several revisions over the years there are apparently still significant behaviours that I'd consider bugs¹, some of which have been closed as WONTFIX² so are a fact of life you just have to accept if you use MERGE in SQL Server.

[1] see https://sqlsunday.com/2021/05/04/how-merge-can-deadlock-you/ & https://michaeljswart.com/2021/08/what-to-avoid-if-you-want-... and many other similar articles

[2] https://www.mssqltips.com/sqlservertip/3074/use-caution-with...


I'm in an agree-to-disagree state with various sql server consultants on this matter. I disagree that merge is harmful and should be avoided. Using merge w/hints for upserts has a decent performance advantage over the alternatives and comes with some theoretical risk of deadlocks that depend on implementation details of your use case and/or the shape of your data (I've yet to encounter it in practice, I tend to batch upserts and limit concurrency and suspect the folks having trouble do the opposite)


The Oracle docs introduce MERGE as follows:

"Use the MERGE statement to select rows from one or more sources for update or insertion into a table or view. You can specify conditions to determine whether to update or insert into the target table or view.

This statement is a convenient way to combine multiple operations. It lets you avoid multiple INSERT, UPDATE, and DELETE DML statements."

MERGE doesn't claim to work as an atomic statement, and doesn't work as an atomic statement in practice. In other words you can get a unique violation error where a "true upsert" would update the row instead. You can work around this "limitation" in MERGE by retrying the statement until it succeeds, but at that point you might as well use a regular INSERT instead.

The syntax for INSERT .. ON CONFLICT constrains the problem in certain ways, which is quite deliberate. It's a pretty faithful representation of what's really going on. Hiding that creates a lot of subtle but real problems.
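For comparison, the INSERT .. ON CONFLICT form looks like this (schema invented); a concurrent insert of the same key results in the UPDATE branch firing rather than a unique violation:

```sql
INSERT INTO counters (key, value)
VALUES ('page_views', 1)
ON CONFLICT (key)
DO UPDATE SET value = counters.value + EXCLUDED.value;
```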


the docs also say "every statement is atomic", and the ambiguity around whether MERGE is one statement or two (or three) ought to be settled by explicitly stating our atomicity expectations with a transaction. Anyway, my experience has been that adding the lock hints "fixes" the problem with PK violations during concurrent upserts, and it seemed to be commonly known and widely used afaict. fwiw I agree with not adopting "merge w/hints" as the syntax for upsert; I just thought the decision had more to do with the aversion to hints, but I stand corrected. (Also, from reading some of the links here, I see that even MERGE w/hints has a potential deadlock if you get unlucky with the range locks it uses under the hood. I never hit that in practice myself, but it seems clear that SQL should have a standard, more constrained upsert, so that further supports your original decision.)


> "every statement is atomic"

Statement-level atomicity doesn't mean that you cannot get race conditions like the one that I described - it just means that the statement succeeds or fails as a whole.

It's quite clear that you can get those race conditions in both Oracle and SQL Server's MERGE (same with the new Postgres implementation). Last I checked neither SQL Server nor Oracle address the fact that MERGE pretty much doesn't do what many of its users imagine it can do. (There is a narrow theoretical sense in which it does meet expectations provided serializable isolation level is used, but in practice that isn't worth much to users.)

> anyway my experience has been that adding the lock hints "fixes" the problem with pk violations during concurrent upserts and it seemed to be commonly known and widely used afaict

That's not my impression:

https://www.mssqltips.com/sqlservertip/3074/use-caution-with...


there it is, if I had a nickel for every time that link has been thrown at me.. twice in this thread alone ;) we are now in agree-to-disagree territory because my experience has been different and I don't like to leave performance on the table no matter what the consultants say (I suspect consultants are overexposed to the kind of code written by teams who ultimately need a consultant to save them - that has never been my team)


Of course they are now adding merge anyway: https://www.depesz.com/2022/03/31/waiting-for-postgresql-15-...



Did you read the article I posted? It's being added to PG 15: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...


Did you read the text that uhoh-itsmaciek posted?


From 2014. The MERGE discussion is from 2022.


I don't see anything in the 2022 blog post that contradicts the 2014 mailing list post. These are different features with different trade-offs and somewhat overlapping use cases.


Other systems like SQL Server solve this with hints (HOLDLOCK), which ties into the discussion here that PG really needs hints.

So PG doesn't like hints, and instead created non-standard syntax for upsert rather than the standard MERGE syntax with a hint.

Now PG is adding MERGE anyway, and probably could have just extended it with non-standard syntax to get an atomic upsert; instead it has two different approaches.

IMO it probably would have been better to go with the SQL-standard MERGE plus a non-standard hint syntax, so it's more readable across systems, instead of adding non-standard syntax and then adding the standard much later.

Not sure if there is any movement on adding the upsert approach to the SQL standard instead.


to be fair, SQL Server's HOLDLOCK is pretty specific to SQL Server. there's this note on PG's MERGE project page (https://wiki.postgresql.org/wiki/SQL_MERGE):

> PostgreSQL doesn't have a good way to lock access to a key value that doesn't exist yet--what other databases call key range locking (SQL Server for example). Improvements to the index implementation are needed to allow this feature.

as of 2017, at least, the plan was to implement MERGE using the underlying INSERT ON CONFLICT mechanism, and I think everyone agrees that the only reasonable expected behavior from MERGE is atomic upserts (and the PK violations from concurrent merges on SQL Server and Oracle really should be recognized as bugs!), so maybe Postgres MERGE will be the first non-buggy implementation without hints. reading the related threads exposes a surprising amount of complexity, though, so who knows.


> as of 2017 at least the plan was to implement merge using the underlying insert on conflict mechanism

That was something that was discussed at length at the time, but ultimately rejected. And so the implementation of MERGE in Postgres has essentially the same limitations as every other MERGE implementation that I'm familiar with.

Postgres clearly stresses the trade-off that MERGE makes in the docs, though. Which probably sets it apart.


afaik postgres still doesn't have a merge command (just tested on 14.2)?


It will be in Postgres 15, which will be out later in the year


Many databases are also used for executing dynamically generated SQL queries: ORM based, or BI/ETL generated queries. If you rely on hints too much, having good statistics is often neglected, and this will impact any query that cannot be provided with hints.


I've found that the planner generally does a good job with single-table restrictions. With joins, not so much. Sometimes simply turning a join of multiple tables into a CTE has forced the planner to be more efficient. Sometimes, counterintuitively, doing the join in the application is faster.

And yes, before resorting to any of those, I've spent time doing EXPLAINs and making sure I have the right indexes.


Simple restricting joins are usually okay, but joins with "or" conditions on largish tables can easily throw it off completely. I find that I often need to rewrite those to use UNION.
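A sketch of that rewrite with invented tables; the UNION form lets each branch pick its own index instead of forcing one plan to satisfy both sides of the "or":

```sql
-- Often planned badly on largish tables:
SELECT o.*
FROM orders o
JOIN users u ON o.user_id = u.id OR o.billing_user_id = u.id
WHERE u.email = 'a@example.com';

-- Usually planned well (UNION also deduplicates rows matched
-- by both branches):
SELECT o.* FROM orders o
JOIN users u ON o.user_id = u.id
WHERE u.email = 'a@example.com'
UNION
SELECT o.* FROM orders o
JOIN users u ON o.billing_user_id = u.id
WHERE u.email = 'a@example.com';
```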


Whenever I see query planning in databases, I immediately think that artificial intelligence should be able to do this pretty well. Which databases out there actually use some sort of AI to create a plan and execute?


Postgres does. :-) At least if genetic algorithms count as AI. If there are a lot of tables in the query, Postgres will switch to its GEQO planner [0]. The hardest thing about planners is they need to make a decision fast. It would be silly if they took longer than the marginal improvement over a worse plan. And since the possible ways to join tables grows exponentially, an exhaustive comparison quickly becomes impractical.

[0] https://www.postgresql.org/docs/current/geqo.html
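The relevant knobs, for the curious (the defaults noted are from recent Postgres versions, iirc):

```sql
SHOW geqo_threshold;       -- 12: FROM items at which GEQO takes over
SHOW join_collapse_limit;  -- 8: above this, explicit JOIN order is
                           -- kept as written instead of being searched
SET geqo = off;            -- force exhaustive planning (can be very
                           -- slow for queries with many joins)
```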


Kind of unrelated ... but what's the pganalyze equivalent for MySQL?



