Why is everyone pushing CTE's for "readability" for Postgres queries? I've seen this come up multiple times in the past few months.
CTE's (WITH clauses) are optimisation barriers — the query planner in Postgres won't inline CTEs, but instead will materialise their result into a temporary table first, preventing qual (WHERE clause) push-down and other optimisations.
I know this because it bit me, hard — I went nuts and wrote a lot of the queries in a new system to use WITH clauses, for readability. Performance was atrocious, and required refactoring to inline all the WITH clauses as sub-SELECTS.
CTE's are useful, but should not be used to improve readability.
CTE's should be used to improve readability, they shouldn't be used if performance is a concern. There are many queries where the difference will be negligible if even noticeable. We should write code for humans first.
That being said, does anyone know if there are any plans to address them being optimisation barriers, or what the challenges are? Seems like even a naive rewrite into subselects before even creating the plan would reap a lot of benefits.
.. which indicates that reliance on them being optimization fences may be too ingrained to change. I had assumed that they were treated as sub-selects (given the generally advanced optimizer that PostgreSQL has), and was quite surprised to realize that they aren't. I think this is a major shortcoming of PostgreSQL, especially given the readability advantages of CTEs makes them quite attractive to use (I usually recommend heavy use of CTEs to beginning SQL-ers).
I'm pretty sure that the optimisation barrier is a 'feature'. This can actually be handy, especially when replacing a complex subquery that the query planner turns into a nested loop.
I've used a CTE before specifically for its optimisation barrier. The query planner wasn't generating the query I knew would be faster, so I re-jiggered it a bit with a CTE and it dropped from 100ms to 1ms, as I expected. I know there's ways to fiddle with the query planner but sometimes it's just easier to tell postgres: "No, do it this way".
I'm changing jobs and finally getting to switch to postgres for my day job from mssql for years and years - things like array_agg, jsonb and range types have me salivating.
I wish someone would do the advanced Postgres version of the standard RDBMS "library" example as a book, going beyond just joins and simple window functions to arrays, json, notifications etc.
Greg Smith, the author is a former colleague and friend. His book on the performance side is definitely one of the better ones out there, but it's a few years old now and definitely more about tuning disks and storage more so than advanced usage. Sadly, I don't have a better one to recommend for the advanced user.
Definitely there's a market for postgresql books. Specifically, books with laddering type of learning experience (e.g.: beginner, intermediate, administration, developer, etc).
My favorite feature is explain (analyze, buffers, verbose). It took some work to learn how to understand it, but I don't think anything else had such a large payoff.
I also like the information and catalog schemas. Not something I use all the time, but all of the tables in each schema are worth exploring and seeing what information can be found. When they are needed, they can save a lot of time.
Call me crazy, but I also like using PL/pgSQL. A helpful function I have takes two table or view name arguments and compares if they are the same result set. Total life-saver and a major time boost for redesigning queries and checking to make sure table transfers won't cause corruption.
I love the error messages. They are so clear and always correct. I've found myself reading them and thinking "that can't be right" and sure enough, I was wrong.
Configuration is a breeze. The hba and conf files are fully documented and simple to understand. The location of the files are sensible and easy to find. Seeing how other database systems deal with configuration is eye-opening to say the least.
Last but not least, the documentation is just incredible. I've been asked about good PostgreSQL books, and I always say "read the docs; you'll see why there are few books around."
I especially second the "arrays" advice. Arrays of some datatypes (notably integers) have GIN and GiST index support for membership tests. This makes them good for denormalizing many-valued joins for fast lookups.
SequelPro is incredible - easy to use, beautiful, reliable, powerful - but it only works with MySQL (or MariaDB). I've tried Toad but it's very buggy and ugly. The AppStore has SQLPro for Postgres, but I've not heard any reviews, so I'm reticent to spend $30. I've used a Navicat trial for MySQL and it was good but costs around $200.
+1 for PGAdmin. I've been using it for several weeks and it's been pretty good, although it does have a couple of flaws. One of them appears to be an OSX issue, though, where opening tables sometimes throws part of the window (the top bar) outside of the screen (and the only way I've found to fix it is to click on Window > Zoom).
I recently bought a license to Postico (https://eggerapps.at/postico/) which is the successor to PG Commander. Switched from pgAdmin which is functional, but Postico's UI is nicer.
JetBrains' DataGrip (or any of their IDE's if you have them) is really nice, I've been using it in tandem with PgAdmin (for general maintenance tasks since it has a GUI for some common stuff like permissions).
I'm experimenting with https://eggerapps.at/postico/ currently and it's pretty nice. Fairly similar to Sequel Pro in a lot of ways (which I miss too when in Postgres-world).
I recommend and use sqlworkbench, which is great as long as you don't work for one of the govts disallowed by the license (which is apache otherwise) http://www.sql-workbench.net/
Are you using \x or \x auto? \x auto will automatically format it to the size of your screen for when things wrap or don't. I've found it's more often than not what I actually want.
Which ones specifically, im sure commenters here could help. ETL = Extract Transform Load, a common pattern when transferring data from one system to another. GIS = Geographical Information System, having to do with positions in space, generally lat/lng and mapping concerns.
CTE's (WITH clauses) are optimisation barriers — the query planner in Postgres won't inline CTEs, but instead will materialise their result into a temporary table first, preventing qual (WHERE clause) push-down and other optimisations.
I know this because it bit me, hard — I went nuts and wrote a lot of the queries in a new system to use WITH clauses, for readability. Performance was atrocious, and required refactoring to inline all the WITH clauses as sub-SELECTS.
CTE's are useful, but should not be used to improve readability.