Hacker News new | past | comments | ask | show | jobs | submit login
Examining PostgreSQL 9.4 – A first look (craigkerstiens.com)
162 points by kermatt on Feb 2, 2014 | hide | past | favorite | 31 comments



My favorite merged 9.4 features missing from this post:

* GIN optimizations for smaller index size and faster search. This should mean faster full text searching

* Time delayed replication standbys.

* CHECK OPTION for auto updatable views, I believe the constraints were slightly relaxed on which views can be updatable too.

* Replication slots looks promising, this should make it easy to setup a replication slave which does not have to be reinitialized on long network failures. No more WAL shipping needed.

* Printing the planning time in the EXPLAIN output (only since I wrote it myself).


> Time delayed replication standbys

Indeed, huge. Mysql has a relatively mature lag replica history now albeit via Percona. Adding delayed replicas to Postgres will be fantastic.


In what situations would a delayed replica be useful?


> In what situations would a delayed replica be useful?

When we start to have slightly larger databases (multi terabyte) then delayed replicas become a very useful backup and recovery strategy. Typically we need to recover databases when there has either been human error (someone deletes a bunch of data they should not have) or a compromise. In these situations rather than restoring for a nightly backup it is _always_ faster to roll a delayed replica forward and promote it to be master then resync all the normal replicas.


Good article. For generally following what's coming in Postgres, I find that Hubert "depesz" Lubaczewski's blog [1] is excellent; he has a whole series called "Waiting for [version]" that covers upcoming features, and he always includes examples of usage. For example, he has two pretty extensive articles on pg_prewarm [2] and the ordered-set features [3].

[1] http://www.depesz.com/tag/postgresql/

[2] http://www.depesz.com/2014/01/10/waiting-for-9-4-pg_prewarm-...

[3] http://www.depesz.com/2014/01/11/waiting-for-9-4-support-ord...


Agreed, despez is a great source as is Michael Paquier's blog (http://michael.otacoo.com/). The goal here was to boil it up a bit more to the practical uses and benefits versus the technical details of it. Further there was a talk at FOSDEM just yesterday which highlighted much of whats committed in 9.4, though slides aren't online yet.


nitpick: it's "depesz", not "despez" (pron. depe'sh)


or http://planet.postgresql.org , I found it worth following.


Wow. That would be great to have UPSERTS, as mentioned at the end of the post.

I've been lugging my RULES/Trigger functions for this feature for years.


> That would be great to have UPSERTS

Indeed. Postgres has some amazing features, and yet something so basic has been missing for years. I can't wait for a simple UPSERT, especially that most workarounds work correctly only with 1-row inserts.


In 9.1 and higher, the writable cte's give you multi-row upserts. It isn't particularly simple though.


Yeah, this feature would make me incredibly happy. With the current program I'm writing, I swear UPSERTS would cut a good third of the code out.


There has been a lot of discussion about which way UPSERT should be implemented so I remain cautious about if it will land in 9.4.


auto-update on materialized views! one step closer towards an OLAP solution.


Concurrent refresh for materialized views is also pretty cool!


where did you read that they auto-update now?


Wherever it was it is false, the only thing added for materialized views in 9.4 is concurrent manual refresh. In 9.3 refreshing locked the materialized view.


Concurrent materialized view refresh and upserts would both be huge in my opinion.


Materialized views "Didn’t auto-refresh"? Maybe I'm dumb or not getting it, but if it doesn't automatically refresh (it should be straight up transactional), then what's the benefit over a normal table? Just a perf gain for queries you can't modify (but can have non-transactional/stale data)?


I'm using them. Without them, I'd have to periodically load my main table and, in the same process, insert data derived from main table to summary tables. Now I can just setup materialized views and after I'm done loading data into the main table, just 'refresh' and expect all materialized views will to be synced. Somehow, it feels like a much lighter cognitive load on my brain.


Will we see support for partial updates of json types? Right now, if one needs to update a json column, you always have to overwrite with the full json data.


Not that I'm currently aware of, though if JSONB gets a little committed it will give a better foundation for it.


The nested hstore patch has replace(hstore,text[],hstore) which would do this but I see nothing similar in the jsonb patch. Perhaps it should be added.


When is proper inheritance being done?


Table inheritance already exists, the problem is that its generally not a great idea for databases. I believe the general consensus from much of the community is they wish Postgres didn't have it. I don't foresee many improvements to it in the near term, but also very hard to predict what will come out of it as if someone wants to scratch that itch it has a chance of getting in.


What is wrong with table inheritance, given it makes for a good partitioning mechanism? Is it being misused in other areas?


I actually like table inheritance. However there are all kinds of quite annoying cases about it and it gets misused in many cases.

What it works for really well

Table inheritance works really, really well for enforcing consistent interfaces to repeatedly used pieces of information which are independent for referential integrity purposes. For example, we've all seen horrors involving global notes tables with umpteen join tables.... Inheritance provides a very clean solution to that problem: have an abstract notes table (which can double for query purposes as a global notes table) and worker tables which have foreign keys which attach specifically to other tables.

For example, in LedgerSMB we have a note table, an invoice_note table a eca_note table (notes for customer/vendor agreements), and more. The nice thing is, the tables all have the same structure and can be managed structure-wise as if they were a single table.

For example, an alter table statement on note can affect all sub-tables in many cases (other than unique constraints, primary or foreign keys, etc). If I want to add a virtual column for full text searching, I can do this with a single function as follows:

    CREATE OR REPLACE FUNCTION tsvector(note) 
    RETURNS tsvector
    language sql immutable as
    $$ SELECT to_tsvector($1.subject || ' ' || $1.note); $$;
Then eca_note.tsvector will just work (subject to limitations of this syntactic feature of PostgreSQL). I could even index the output of the function on any note subtable.

What It Does Not Work For

So inheritance is often sold as a way of tracking part/whole relationships and other type/subtype problems. The problem is that these currently break down where you need referential integrity enforcement across an inheritance tree. Consequently, while I see inheritance as a really, really useful feature, it is a feature that is largely useless for the problems it was originally intended to solve. Now, it is getting better (9.2 added NOINHERIT constraints, which allow you to apply different check constraints to parent and child tables, useful if you want to forbid all inserts to the parent table), but the really big problems have to do with the inability to properly inherit unique indexes, and therefore not to referential integrity enforcement against a whole inheritance tree.

In general in these cases, you are better off with a single table, and designing a structure without table inheritance to model the information.


By this I assume you mean "inherits primary key/foreign keys properly?"

The problem I think is that this is likely to force some significant redesign of indexes. It isn't a trivial problem to solve.


Is bi-directional replication going to be included in 9.4?


No, but some parts of the underlying infrastructure have been merged. The only part of this work which is useful by itself is the replication slots which can be used instead of wal_keep_segments to make sure WAL is kept on the master until it has been received by the slaves.


Very unlikely.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: