Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Very cool. What options do you pass to `mysqldump' for best diffability?


Straight from the dump scripts:

  --complete-insert --hex-blob --skip-add-drop-table --single-transaction --order-by-primary --skip-dump-date
  --force # so mysqldump does not bail out on invalid view
  --no-create-info # do not emit CREATE TABLE, because its AUTO_INCREMENT changes often, and would create unnecessary differences

where the `order-by-primary' is probably the most important, and `skip-dump-date' sure helps.

Also, I make a big deal out of spreading every large table into a set of smaller dumps, each with fixed number of rows, sorted by record ID. For various reasons, most of our tables are usually appended-to, and changes (UPDATE, DELETE) are less common. Thanks to this, changes are usually confined to the last file of a set (with newes records), and other files stay mostly unchanged -- and so they pack the best.

  --where="_rowid >= $FROM AND < $TO"
I try to keep individual files down to about 8...16MB, 32MB max, so git's repack (upon pull/push/automatic gc) doesn't take too much of time nor RAM.


How do you split the files? Is that part of mysqldump (if so, how), or is it a handrolled thing?


For now, handrolled. The idea is to be able to do either `cat * .sql' or just `cat LAST-PART.sql'. I run mysqldump once per each large table with --where="_rowid >= $FROM AND < $TO" argument to mysqldump, and call mysqldump in loop with consecutive $FROM and $TO. It works, it gets the job done, but it's not transaction safe.

That `_rowid' is a reserved symbol in MySQL. Refers to table's PRIMARY KEY (but only if it's single INT). In the usual case, the script doesn't have to know table's PRIMARY KEY.

Another way would be to use `rolling checksum' to split files; the concept described in http://beeznest.wordpress.com/2005/02/03/rsyncable-gzip/ But you could end up with dump files split in the middle of SQL statement, not very cool.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: