I would recommend using pgloader (http://pgloader.io/) instead. It is simple to use in the default case, but can be configured with rules for how data types should be converted if necessary. It can also clean up weird timestamps like 0000-00-00 00:00:00.
There doesn't seem to be any mention of SQL Server on the site? I am in the process of researching a move to Postgres from SQL Server, so if you have any other info on how you achieved this that'd be very much appreciated.
EDIT I see it now. Still, anything that you think useful would be great :)
Nice to hear that it works so well already. It is a very new feature which was sponsored by the company I work for only a couple of months ago. We are not even done migrating our database yet (we have to convert stored procedures and the queries in the application too).
No foreign key support is kind of a deal-breaker. Hard to believe anyone who is actually in the position of needing this would have a schema so simple that it contains no foreign keys.
Ah. That's a relief, I was a little worried. But that makes it strange that someone, knowing of this limitation, would choose to start with the xml dump format.
Well, I've had issues with pg_dump's standard format. It can't be directly used to restore a db, one must manually set it to dump a .bak or something else.
Not sure what you refer to here. All pg_dump formats should work for restoring a database. The only flaw in pg_dump that I know of is the mess if you want to dump an entire cluster of databases but still have one file per database.
We were getting really close to converting our RDS instance from MySQL -> PG, but now with Amazon Aurora being the spiffy super-scalable/etc. RDS engine, we'll probably end up moving to that when/if needed.
Which is a mixed blessing; Aurora sounds great, but I was looking forward to getting back into PG after years of RDS being MySQL-only.
Yeah, seems like postgres is the only database that doesn't support those things. I mean, I use postgres, but I curse the lack of this feature regularly.
I have tried to read into why PostgresSQL doesn't yet (they are working it) have upserts. Several times I have come across people discussing that Merge which is much more powerful the simple upserts, also doesn't really mandate the nature of upserts in most databases.
Simply a user might expect that upserts always succeed with either an update or insert, and never return an error (except when the consistency model is set high enough).
The problem is that many implementations of merge doesn't provide that guarantee, unbeknownst to many users. Or they are simpler databases such as sqlite (which doesn't provide the same multi-user/transaction performance). Postgres, as they are well known for, want to implement it properly. They know that application developers does not expect that they have to resubmit failed upsert. Aside: they are also working on audit features, which presents a number of implementations difficulties that upserts also does. So some time or later, probably not to far away, we will see upserts in postgres. Proper atomic, performant, nice upserts.
Here it actually matters a bit. The pgloader project was rewritten in Common Lisp from the original Python version to speed up processing the data. You want to be able to migrate terabyte databases without too much downtime.
Just because a few very disciplined people can make something clean and functional in a language does not mean it's a strength of the language. PHP is great as a template language, but when it comes to manipulating data it's somewhat louche approach to data types is a liability.