PostgreSQL Parallel Query v2 (rhaas.blogspot.com)
265 points by Tostino on March 14, 2017 | 42 comments



Besides parallel queries (specifically parallel index scans), the Postgres 10 feature I'm most looking forward to is the work Andres Freund has done on using LLVM to JIT-compile expressions[0]. Based on the layout of the database I work on[1], an insane amount of CPU goes to evaluating partial index predicates. If the improvements are anywhere near what is being described on the mailing list, we're going to save a ton of money on CPU.

Parallel index scans will be an absolutely massive win for us. Currently Postgres does not perform any sort of prefetching during index scans, so the random I/O is fully serialized. In our case, queries wind up using ~1/10th the I/O bandwidth they could be using if they were to issue more I/O in parallel. We are expecting a dramatic speedup just from parallel index scans. This is in addition to the bugfix in 9.6 (we're currently on 9.5) which added proper support for index-only scans over partial indexes. That bugfix should also result in a pretty large speedup for a lot of our queries.
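For context, a minimal sketch of the kind of partial index involved (the table and column names here are hypothetical):

```sql
-- A hypothetical partial index. Before the 9.6 fix, the planner would
-- not consider an index-only scan here, because "type" appears in the
-- index predicate but is not stored in the index; 9.6 notices that the
-- query's WHERE clause is implied by the index predicate.
CREATE INDEX events_signup_user_idx ON events (user_id)
    WHERE type = 'signup';

-- On 9.6+ this can be satisfied by an index-only scan
-- (assuming the visibility map is reasonably up to date):
SELECT user_id FROM events WHERE type = 'signup';
```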

As soon as Postgres 10 comes out, we are immediately going to upgrade. There are just so many huge wins for us.

[0] https://www.postgresql.org/message-id/20161206034955.bh33pae...

[1] https://blog.heapanalytics.com/running-10-million-postgresql...


Please share profiles on the list. Without known problematic production workloads, it's very hard for us to optimize appropriately. You mention predicate evaluation - it's possible this won't be affected by these changes (the index internals use lower level accesses than generic expression evaluation).

Re prefetching: PG does that for bitmap index scans, if you set effective_io_concurrency > 1.
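A minimal sketch of how that looks (the table and values are hypothetical; the right setting is workload-dependent):

```sql
-- effective_io_concurrency only affects bitmap heap scans; it defaults
-- to 1 and does nothing for plain index scans. A rough starting point
-- is the number of drives in the array.
SET effective_io_concurrency = 32;

-- With a bitmap scan plan, Postgres can issue multiple prefetch
-- requests (via posix_fadvise) for heap pages ahead of the scan:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events WHERE user_id BETWEEN 1000 AND 2000;
```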


It should be interesting, but an LLVM JIT brings in a substantial up-front cost. Maybe Andres found a way to mitigate it.


The plan is to just use interpretation of expressions for short-running queries. I've put a lot of work into making the interpreter faster this release cycle, and into allowing easier integration with less duplication between JIT and interpreted evaluation. There are two approaches for deciding when to JIT: a) based on the query's cost estimate, or b) just doing it after a threshold number of executions. I've not yet decided / experimented enough to make a call.

Additionally, at a later stage, caching of compiled programs will play a role, to reduce the frequency of llvm invocation.


Is it significant when you are talking about analytical queries which run for minutes or hours? It seems like it wouldn't be a huge amount of code to JIT compared to the payoff?


Presto, which is designed for analytics, JITs to bytecode on every query and still works fine on sub-second queries. One potential difference from LLVM is that the JVM profiles and only generates machine code for hot code, so short queries might run on JVM-interpreted bytecode.

But you are absolutely correct that for long running analytic queries, the up front cost of generating optimized code pays off.


Don't forget native logical replication.


Logical replication doesn't matter as much to us. We are using CitusDB (i.e., sharded Postgres), and we have our own code for handling replication.


Citus should already be saturating I/O during a scan unless you're only hitting one shard. What am I missing about your use case?


In many cases all your data is actually in memory, especially when you can distribute your workload across many machines. If all your data is in memory, you'll very often hit CPU bottlenecks before memory bandwidth issues: partly because memory bandwidth, disregarding latency, is just quite high, and partly because expression evaluation in Postgres isn't that well optimized (yet).


We shard our data by customer and individual queries touch only data for one customer. We do split a customer's data into multiple shards, but the number of shards we would need to split each customer into is not worth the overhead of maintaining that many shards.


Why? Is this a security issue? If you're sharing the same distributed table, then sharding by customer limits the number of nodes/scans that can run in parallel.


Having millions of shards is a bit of a pain ops-wise. There are also some other costs to having lots of shards. For example, the current Citus distribution planner takes time linear in the total number of shards. This isn't an issue 99% of the time, but since we use a composite type as our distribution key, we have an extremely high constant overhead for planning time. I believe our current median Citus master planning time is ~350ms. We've talked to Citus about this and have discussed possible solutions. Currently we are trying to purge the composite type from our database.


Doesn't this inherently depend on the query? Shard-local I/O saturation affects the latency of the entire query if you're not expecting the I/O to the client to saturate.


This was exactly my point. But usually you're doing a full scan across all shards, so disk I/O is being saturated by virtue of the MPP sharding.


The relationship between EnterpriseDB and the at-large Postgres community has always intrigued me. Why do they turn over to the community what they could make part of their proprietary fork? Maybe they make a lot of money on consulting, not just licenses, and general adoption is good for that side of the business.


Part of it is that they build the basis for features and work on getting them into core, which makes the value-added features in their proprietary fork easier to maintain. Sometimes giving back and having an easier time maintaining your own fork is worth the associated effort and costs, I guess. I for one appreciate it.


I have never used EnterpriseDB and am only aware of it from a distance, but it has always struck me as the ideal open source arrangement. Sure, EnterpriseDB is never going to have the revenue numbers of Oracle, but their management and business teams should have a much easier time sleeping at night. I think it is great that EnterpriseDB has recognized that it is better for the company to exist as part of a healthy ecosystem than to try to pave paradise, so to speak.


A healthy relationship with open source projects is a great way to attract talent and gain goodwill.


> but their management and business teams should have a much easier time sleeping at night

Why? What is Oracle doing so wrong that they can't sleep as well at night?


I think their secret sauce is Oracle compatibility, so if Postgres runs better, they run better.


And the Oracle compatibility piece breaks so much that it would never get merged into community Postgres.


Even if it didn't, for all the work they put into the community version one can't begrudge them trying to make a living on the enterprise side.


There are also plenty of devs in the community, outside the EnterpriseDB employees, who do not want to pollute core PostgreSQL with alternative ways to do things it already can do, just for the sake of Oracle compatibility. Aliases have been added for ISO SQL compatibility, but adding features purely for Oracle compatibility (i.e. features with zero value unless you are migrating from Oracle) is not something core PostgreSQL does.


Well, Red Hat has demonstrated you can make a lot of money by open sourcing things (including buying proprietary software like ManageIQ and oVirt and open sourcing it). Given that EDB has ex-RHAT folks involved in setting it up, I imagine they're trying to replay that model.


I don't see how Red Hat has open sourced anything, the direction they are taking things is proprietary, not open.


Red Hat Q4 FY2016 profit: $68M

Oracle Q4 FY2016 profit: $2B


You do realize you're comparing an amalgamation of software companies to a Linux distribution vendor, right? Oracle hasn't been just a database company for ... probably decades.


Apples : Citrus


Will Oracle be around in 10-20 years? Will Red Hat? Which will experience growth?


All things being equal, it would take Red Hat ~30 quarters to make $2B, or roughly 7 years.

7 years to make what Oracle makes in a quarter.

I don't really have a dog in this fight, but just FYI, I think you missed the OP's point.


Red Hat contributes to the common good; Oracle contributes to a few millionaires. Therefore Red Hat creates more value.


Does developing MySQL and Java somehow not contribute to the common good?


FWIW my feeling is that Oracle has made a net negative contribution to the Java ecosystem.



I thought we didn't care about cash flows when valuing tech companies, just growth prospects ;)


This is a very common model now: several companies have open-source software (either their own or one they're a major contributor to) that is then layered with extra functionality, support, and consulting services.

There are some tradeoffs, since the products are usually not as good and move more slowly than proprietary commercial offerings, but the free access and large communities make up for it.


They do; for example, they provide a fork of Postgres that is compatible with Oracle, and it is not free.


Can't be certain, but one possible large factor is that committing as much as you can back upstream reduces your diffs from upstream. Dealing with those is a major engineering drag.


That is really an amazing amount of work that has gone into this next release.

This is really getting Postgres up to par with the commercial offerings, and hopefully it keeps progressing in v11.

It's especially important now that servers with more and more cores are becoming the norm.


Robert and his team are really making an impressive amount of progress on this.


Definitely agreed.



