
if you want to copy everything and there's nothing at the target:

rsync --whole-file --ignore-times

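that should skip the size/mtime quick check and turn off the rsync block checksum (delta-transfer) algorithm entirely, so every file at the source is sent whole to the dest with very little rsync CPU penalty. rsync still verifies a whole-file checksum for each file it transfers, so the overhead is small rather than zero.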

also, for this purpose it looks like -H is required to preserve hard links, as the man page notes:

"Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H."

it would be mildly interesting to see a speed test between rsync with these options and cp.

there are also utilities out there to "post-process" and re-hardlink everything that is identical, so that a fast copy without preserving hardlinks and then a slow de-duplication step would get you to the same endpoint, but at an obvious expense.
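As a rough illustration of that "copy first, re-hardlink later" step, here is a minimal Python sketch. It is not any particular utility's algorithm: the names `file_digest` and `relink_duplicates` are made up for this example, it compares content only (size plus SHA-256), and it ignores differences in ownership, permissions, and timestamps, which a real tool would probably need to consider.

```python
import hashlib
import os
import stat


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def relink_duplicates(root):
    """Replace byte-identical regular files under root with hard links.

    Assumes everything under root lives on one filesystem, since hard
    links cannot cross devices. Returns the number of files re-linked.
    """
    first_by_content = {}  # (size, sha256) -> path of the first file seen
    relinked = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and special files
            key = (st.st_size, file_digest(path))
            keeper = first_by_content.setdefault(key, path)
            if keeper == path:
                continue  # first file with this content
            kst = os.lstat(keeper)
            if (kst.st_dev, kst.st_ino) == (st.st_dev, st.st_ino):
                continue  # already the same inode
            # Link under a temporary name, then rename over the duplicate,
            # so the path never disappears part-way through.
            tmp = path + ".relink-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, path)
            relinked += 1
    return relinked
```

Every file has to be read in full to hash it, which is where the "obvious expense" comes from, and the uncompressed copy needs enough space at the target until this pass finishes.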




I use rsync to replicate 10TB+ volumes without any problems. It's also very fast to catch up on repeat runs, so you can ^C it without being too paranoid.


It sounds like OP had large hard-link counts, since he was using rsnapshot. It's likely that he simply wouldn't have had the space to copy everything over to the new storage without hard-links.


You cannot exaggerate what rsync can do... I use it to back up my almost full 120 GB Kubuntu system disk daily, and it goes through 700k files in ~2-3 minutes. Oh, and it does it live, while I'm working.


To be fair: the case described here is two orders of magnitude larger than your use case in both dimensions, data size and file count.

And the critical aspect from a performance perspective was where the hash table became too large to fit into memory. Performance of all sorts goes pear-shaped when that happens.

rsync is pretty incredible, but, well, quantity has a quality all its own, and the scale involved here (plus the possible in-process disk failure) likely wasn't helping much.
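To make the hash-table point concrete: preserving hard links during a copy generally means remembering every multiply-linked inode that has already been copied, so later paths to the same inode can be linked instead of copied again. Below is a minimal Python sketch of that idea. It is not how cp or rsync actually implement it; the function name is hypothetical, and it only handles regular files and symlinks to files.

```python
import os
import shutil


def copy_tree_preserving_hardlinks(src, dst):
    """Copy src to dst, recreating hard links instead of duplicating data.

    Returns the number of entries left in the inode table, which is the
    structure that grows with the number of multiply-linked files.
    """
    seen = {}  # (st_dev, st_ino) -> destination path of the first copy
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            dst_path = os.path.join(target_dir, name)
            st = os.lstat(src_path)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in seen:
                # Same inode seen before: link to the earlier copy.
                os.link(seen[key], dst_path)
            else:
                shutil.copy2(src_path, dst_path, follow_symlinks=False)
                if st.st_nlink > 1:
                    # Only multiply-linked files need to be remembered.
                    seen[key] = dst_path
    return len(seen)
```

With a handful of snapshots the `seen` table stays small. When nearly every file has a link count above one, as with long-running hard-linked snapshot sets, the table holds an entry per inode, and that is the structure that can outgrow memory.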


Yep, my answer when cp or scp is not working well is always to break out rsync, even if I'm just copying files over to an empty location.

I've had good luck with the -H option, though it is slower than without the option. I have never copied a filesystem with nearly as many files as the OP with -H; the most I've done is probably a couple million, consisting of 20 or so hardlinked snapshots with probably 100,000 or 200,000 files each. rsync -H works fine for that kind of workload.



