Postgresql 9.2 will have boat-loads of performance enhancements

FooBarWidget · on Nov 30, 2011

These are fantastic improvements. But I wish there is auto-sharding support for PostgreSQL, or at least an auto-sharding layer. For some data sets I really sharding and building things manually just takes too much time. Unfortunately sharding is not as easy as 'insert into server number object_id % NUMBER_OF_SERVERS'; a proper auto-sharding application should also handle moving data from one shard to another.

bad_user · on Nov 30, 2011

Adding or removing shards can get complicated and it is better to handle this at your application layer.

For instance, you could keep track of the configuration history - when adding a shard, you create a new version. And when trying to fetch, you could try the latest version of your configuration (which says, for example, that it's on server 3) and if not found you could try the previous version (which says that it's on server 1). You can also add the possibility of removing a shard, however this requires that for every write you'll have to do a secondary write to another server, for redundancy.

In the background you could have a healer that whenever a new version gets pushed, it would go through all entries and move them accordingly. And at the end it would declare such a version to be "healed".

However, what I described above works in case you want to introduce a new server in production immediately. Like say, in the case of something like Twitter, you want to write to the new shard right away and the most important messages are the most recent ones. But you may also want to create a configuration such that a new server isn't hit by the front-end until the healing script hasn't finished migrating.

The thing is, a layer like this gets to be very optimized for your needs and there's no one size fits all.

ianso · on Nov 30, 2011

I have a proof-of-concept that illustrates a way of auto-sharding a Postgres database using a patched version of PL/Proxy: http://code.google.com/p/hotrepart/

The big limitation is that your DB structure has to be hierarchical. I think a lower-level solution at the WAL level would be better.

wulczer · on Nov 30, 2011

There's talk about making partition management easier, but I doubt it's going to be present in 9.2

You can look at PL/Proxy that makes some things related to partitioning across multiple machines easier.

chrislomax · on Nov 30, 2011

I like all the talk surrounding Postgresql recently, does anyone have any experience with moving from MSSQL to Postgresql?

I am not asking if one is better than the other because it depends what I am using it for but what about performance / reliability?

masklinn · on Nov 30, 2011

> what about performance / reliability?

Postgres has very good reliability (for the longest time, reliability and completeness were the only two driving goals of the project). It also has acceptable-to-good performances (on unices anyway) but the default configuration is completely untuned.

This default is meant to run on very weak machines, so the db "works" everywhere OOTB, many people don't configure their postgres (and it's not been trivial historically, it's gotten a lot better) and end up with miserable performances.

This issue is most visible with "the other open-source database" (MySQL) as it has 1. a much more aggressive default configuration and 2. MyISAM default (I think it might have been switched now, but that's pretty recent), which has very fast reads (and nothing else)

An other issue with postgres is probably the lack of postgres DBAs (let alone good ones) compared to Oracle and MSSQL, though any good and open-minded DBA should be able to get up-to-speed quick on it. The last issue would be tooling, I believe MS provides lots of administration and GUI tools for MSSQL, there's a dearth of those for postgres.

grey · on Dec 1, 2011

There's a fair number of really good Postgres tools out there, or tools that support Postgres among other platforms, They're just pricey. The free tools are passable but not that good.

jeltz · on Dec 1, 2011

For me personally PostgreSQL makes up for the lack of good free graphical tools by providing an excellent command line client, psql, and good utility views and functions. I much prefer it to the CLIs of MySQL, Oracle and SQLite.

bad_user · on Nov 30, 2011

I do not have experience with MSSQL, but I have been using PostgreSQL since 5 years ago.

It's a really reliable database, more reliable than MySQL. It has all the features you expect a real RDBMS to have.

It has good performance too. There's this myth going around, with people saying that it doesn't have good performance. However it is just a myth, started because the default configuration is optimized for poor servers without much memory. Make sure to optimize your configuration (i.e. specifying how much memory can PostgreSQL use for certain tasks - you can find guides for that online).

bmurphy · on Nov 30, 2011

I used MSSQL up to MSSQL 2005 and MySQL up until about the same time. Around then I switched over to PostgreSQL completely and have not had a reason to go back. My biggest complaints had to do with simple replication and that was mostly solved in Postgres 9.0. Postgres is a damn fine database.

chrislomax · on Nov 30, 2011

Well I am quite honestly bored of paying £240 a month per cpu licensing costs.

I have no objections for paying for licenses but it's just getting too much. They put their costs up recently from £180 a month to £240, when we tried to pass that onto a client they pulled their face, it's getting hard to justify the cost of running a web server on a Microsoft platform

hkarthik · on Nov 30, 2011

Such comparisons are often difficult since MySQL and Postgres typical perform significantly better on Unix based systems than they do on Windows. Also using the Windows/.NET libraries for connections tend to perform worse even if the database server itself is hosted a Unix box.

In general, I would say you should stay on MSSQL if your server environments are Windows and only look at the OSS alternatives if you're willing (and able) to make a full stack switch to Unix.

durzagott · on Nov 30, 2011

At the company I work for we are currently in the process of migrating from MSSQL to PostgreSQL. However, we are trickling to a full stack switch to Linux. In the meantime, the Postgres databases have performed extremely well and on balance we are happy with it.

Unfortunately I would argue that at an enterprise level MSSQL is better than Postgres (pure conjecture here). If nothing else, it is way easier to employ MSSQL experts where I live and next to impossible to find Postgres talent. This pretty much means that we are learning by trial and error what the best practices are.

hkarthik · on Nov 30, 2011

Great to hear that your company is forward thinking and willing to make the considerable investment to get off the MS Stack.

You might have more luck finding some Oracle DBAs and training them up on PostGres. There's a lot of similarities between the two, and PG/SQL is also very similar to PL/SQL.

nodata · on Nov 30, 2011

If you need remote Postgres talent, go to the source: www.EnterpriseDB.com

wulczer · on Nov 30, 2011

Actually, there's quite a few companies that offer professional PG support, there's a list on the official PostgreSQL page:

http://www.postgresql.org/support/professional_support/

chrislomax · on Nov 30, 2011

Well we do currently use .net but I asked a question the other week on here about different languages etc and one good piece of advice I got was to create a REST service and interact with that.

We have put a lot of thought into it and we were going to give it a go with WCF and REST but I don't see any issue why we cannot buy a UNIX box and put Postgresql on it and put a layer on it with some other language serving the REST?

I will be benchmarking this with a current customer who has over 1 million products and a lot of visitors a day to see how it stands up against MSSQL, we are quite sensible into not jumping into decisions so I appreciate your comment and it shall definitely be part of the benchmark.

dos1 · on Nov 30, 2011

Having spent the better part of 3 years creating REST services with .NET, I strongly recommend you do NOT use traditional WCF. The WCF Web API stuff looks interesting but still seems immature. I've found ASP.NET MVC to be the best tool for creating REST apis.

hkarthik · on Nov 30, 2011

Even MVC is a little heavy and you don't get a lot of things for free with it. Compare it to some of the Java OSS projects like Enunciate and it still requires a lot of extra work.

On the .NET side, I would suggest looking at Demis Ballot's excellent Service Stack library. http://www.servicestack.net/

chrislomax · on Nov 30, 2011

Thanks for the advice

I have found the routing in MVC and their returns types (JSON etc) working really well to be honest, I will take a look at that

Thanks

whalesalad · on Nov 30, 2011

I know a lot of this will be up to the architecture and specific nature of each app... but has anyone recently compared a modern MySQL and Postgresql installation on a few baseline metrics? I'd love to see the differences. I am starting to use it more and more, being a Django developer surrounded by Postgres users. But, so far the core differences are scaring me away. psql template1? It doesn't make as much sense to me.

obtu · on Nov 30, 2011

Why connect to template1? It should be invisible. Create a user and a database for your webapp. Matching the unix account name is the simplest way to do that.

In pg_hba.conf, let unix users authenticate as matching PostgreSQL users:

    local all all peer

The postgres user on Debian/Ubuntu can create users and databases:

    sudo -u postgres createuser -SRD webappuser
    sudo -u postgres createdb webappuser --owner=webappuser

    sudo -u webappuser psql

masklinn · on Nov 30, 2011

> psql template1? It doesn't make as much sense to me.

What doesn't make sense about it? It opens a postgres shell to the `template1` database, that's all...

And there should be very little reason to connect to `template1`: it is the default template for creating a database (it's copied to the new db file basically), so the only point in connecting to it is e.g. creating a few common tables or enabling extensions so that all the databases you create have them by default.

I've yet to ever need that.

ryanbraganza · on Nov 30, 2011

What's the equivalent of template1 in mysql?

masklinn · on Nov 30, 2011

I don't think there is one, mysql only creates empty databases it does not have a concept of db templates.

bostonaholic · on Nov 30, 2011

Wait, is a boat-load more or less than a butt-load?

DennisP · on Nov 30, 2011

Boats are bigger than butts.