
I also helped with a project once that used pgloader to migrate an old MySQL db to Postgres, and I think this article may actually explain one of the issues we found. We had a UTF-8-configured table in MySQL (we also discovered during the project that MySQL can have encodings set differently per table, while Postgres sets the encoding for the entire database) with UTF-8 data, and when we migrated it into a UTF-8-configured Postgres database, some of the characters were silently corrupted. The corrupted values were still valid UTF-8 in the destination, so nothing raised an error, but they were different characters from the originals, so the text the application read back out was different.

We only caught this because, thankfully, we had written a manual checksum script. It looped through every table, read all values from each row into the application, and hashed the results. We then compared the hashes produced with the app connected to the source MySQL database against those produced with it connected to the destination Postgres database. We ended up having to massage and fix those silently corrupted characters.
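
For anyone wanting to try the same kind of check, here is a minimal sketch of the idea, not our actual script. It assumes two DB-API connections (the demo uses in-memory SQLite stand-ins rather than real MySQL and Postgres drivers), and the names `table_checksum` and `compare_databases` are made up for illustration. One caveat: different drivers can return different Python types for the same column (dates, decimals), so a real script would need to normalize values before hashing.

```python
import hashlib
import sqlite3


def table_checksum(conn, table, order_by):
    """Hash every value of every row in `table`, read in a stable order."""
    digest = hashlib.sha256()
    cur = conn.cursor()
    # Identifiers cannot be bound as parameters; they are assumed trusted here.
    cur.execute(f"SELECT * FROM {table} ORDER BY {order_by}")
    for row in iter(cur.fetchone, None):
        for value in row:
            # Tag NULL differently from the empty string, then hash UTF-8 bytes.
            data = b"\x00" if value is None else b"\x01" + str(value).encode("utf-8")
            # Length prefix keeps adjacent values from running together.
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


def compare_databases(source, dest, tables):
    """Return the tables whose checksums differ; `tables` maps name -> sort key."""
    return [
        table
        for table, key in tables.items()
        if table_checksum(source, table, key) != table_checksum(dest, table, key)
    ]


if __name__ == "__main__":
    # Stand-in databases: the "destination" holds a corrupted copy of one value.
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    for conn, body in ((src, "café"), (dst, "cafÃ©")):
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("INSERT INTO notes VALUES (1, ?)", (body,))
    print(compare_databases(src, dst, {"notes": "id"}))
```

Sorting by a primary key matters: without a stable row order, two identical tables can hash differently. Once a table is flagged, hashing per row instead of per table narrows it down to the affected rows.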



