> I'm only creating /tempdir/$inodenumber if the hard-link count is >1
Which (again) is always the case for the OP.
> and once the destination hard-link count reaches the same count as the source it gets moved
Yes, I've thought of that optimization; I haven't checked whether cp does it already, but it might well.
Thing is, the OP said he's got an nlink count of about 20. This means that, if the links are evenly spread out over the file system, you only see the last link when you're about 95% through. That won't help much.
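Just to sanity-check that 95% figure, here's a throwaway simulation (nothing to do with cp or megacp, purely back-of-the-envelope):

    # If an inode's n links are spread uniformly at random over the traversal,
    # the expected position of the *last* link is n/(n+1). For n = 20 that's
    # about 95%, so the table entry can only be dropped near the very end.
    import random

    n, trials = 20, 100_000
    simulated = sum(max(random.random() for _ in range(n)) for _ in range(trials)) / trials
    print(f"analytic: {n / (n + 1):.3f}  simulated: {simulated:.3f}")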
> If I'm reading it right it requires doing a sort of the full file list.
Correct.
> This means you need to use enough space to list all the files and their inode number even if there are no hardlinks.
My algorithm works with the nlink==1 optimization just as well: just don't put those files into the file list that needs to be sorted; instead, copy them immediately or put them into a list that doesn't need sorting.
And the whole point of using the sort approach is not to make the list or index smaller (that's simply not possible); the point is that it works with lower RAM requirements. Because sorting needs mostly serial access and very little random access, it handles a list much bigger than RAM just fine.
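To make that concrete, here's a minimal sketch of the idea (Python standing in for the actual megacp code; error handling, symlinks, metadata and the external sort are elided):

    import os, shutil

    def copy_tree_sorted(src_root, dst_root):
        """Sketch of the sort-by-inode approach: nlink==1 files are copied
        immediately; paths of multiply-linked files are collected, sorted by
        (device, inode) so all links of one inode become adjacent, then the
        first path of each group is copied and the rest are hard-linked."""
        pending = []  # megacp would append these to a temp file for an external sort
        for dirpath, dirnames, filenames in os.walk(src_root):
            rel = os.path.relpath(dirpath, src_root)
            os.makedirs(os.path.join(dst_root, rel), exist_ok=True)
            for name in filenames:
                src = os.path.join(dirpath, name)
                dst = os.path.join(dst_root, rel, name)
                st = os.lstat(src)
                if st.st_nlink == 1:
                    shutil.copy2(src, dst)      # no bookkeeping needed
                else:
                    pending.append((st.st_dev, st.st_ino, src, dst))

        pending.sort()                          # external sort in the real thing
        prev_key, first_dst = None, None
        for dev, ino, src, dst in pending:
            if (dev, ino) != prev_key:
                shutil.copy2(src, dst)          # first occurrence: real copy
                prev_key, first_dst = (dev, ino), dst
            else:
                os.link(first_dst, dst)         # later occurrences: hard link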
PS. I wonder how much smaller the table gets when you always have nlinks==2 (the best case for the "done with the inode" optimization). The size of the table will follow some curve over the duration of the program; I'm not sure what it is exactly, perhaps half a sine wave? I expect its peak to be about half the peak of the unoptimized variant.
Is it a pity to lose that optimization by using my algorithm? Disk space is cheaper than RAM; dropping most of the RAM usage in exchange for using twice as much disk still seems worthwhile.
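For what it's worth, a throwaway simulation of that curve (random link positions, every inode having exactly two links):

    # Track how many table entries are live (first link seen, last link not
    # yet seen) over a random traversal order.
    import random

    n_inodes, nlinks = 100_000, 2
    events = []
    for ino in range(n_inodes):
        events.extend((random.random(), ino) for _ in range(nlinks))
    events.sort()

    seen, live, peak = {}, 0, 0
    for _, ino in events:
        seen[ino] = seen.get(ino, 0) + 1
        if seen[ino] == 1:
            live += 1            # first link: entry enters the table
        elif seen[ino] == nlinks:
            live -= 1            # last link: entry can be dropped
        peak = max(peak, live)
    print(f"peak: {peak} entries out of {n_inodes} inodes")

With random positions the expected number of live entries at fraction t of the traversal is n * 2t(1 - t), a parabola rather than half a sine wave, peaking at n/2 halfway through, which matches the "about half" guess.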
The "done with the inode" optimization will work better if hard linked paths appear close to each other. So perhaps sometimes still useful. It should be possible to come up with a hybrid: go with the hash table until it reaches a certain size; then convert the hash table into the start of the list, and continue by appending to the list. I'm not sure this is worth the more complex programming, though. If it looks nice just to avoid creation of a temp file in the common case, then a less complex alternative may be to store the list in the heap until a certain size (only then dump to file), otherwise still use my algorithm. That way swapping will behave nicer, too, in case part of the cp heap needs to be swapped out because of other system activity.
PPS. note that we're assuming that all the hard links are within the source tree that's being copied. If the files also have a path outside the source path, then the optimization won't work as you never reach zero.
> My algorithm works with the nlink==1 optimization just as well: just don't put those files into the file list that needs to be sorted; instead, copy them immediately or put them into a list that doesn't need sorting.
If you do that, then your extra storage requirement is now O(paths to inodes with nlinks > 1), which is still strictly higher than O(inodes with nlinks>1). You're also doing two passes, one to enumerate the files, another to do the copying and hardlinking, plus an extra operation (the sort). I'd be very surprised if that was faster, and unless you have an on-disk sort it will still be RAM+swap limited instead of just disk-space limited.
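To put rough numbers on that, using the OP's nlink of about 20 (the counts here are made up):

    inodes_with_links = 1_000_000                 # hypothetical number of inodes with nlinks > 1
    avg_nlink = 20                                # the OP's figure
    map_entries = inodes_with_links               # O(inodes with nlinks > 1)
    sort_records = inodes_with_links * avg_nlink  # O(paths to inodes with nlinks > 1)
    print(map_entries, sort_records)              # the sorted list has ~20x as many records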
> O(paths to inodes with nlinks > 1) .. O(inodes with nlinks>1)
Yes, that's right, I ignored that part in the calculation above.
That could be improved by using a filename + parent-pointer approach for path compression. (The current version of megacp also stores all the necessary stat information (size, uid, gid, perms) in the tempfile to avoid a second stat call; this could be dropped if it makes matters worse.)
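Roughly what I mean by that (a hypothetical sketch, not what megacp currently stores):

    import os

    def build_records(src_root):
        """Filename + parent-pointer sketch: each file record stores only
        (dev, ino, dir_id, name); a separate table maps dir_id to
        (parent_dir_id, dirname), so long common path prefixes are stored
        once per directory instead of once per file."""
        dir_ids = {src_root: 0}
        dir_table = {0: (None, src_root)}   # dir_id -> (parent_id, name)
        records = []                        # (dev, ino, dir_id, filename)
        for dirpath, dirnames, filenames in os.walk(src_root):
            this_dir = dir_ids[dirpath]
            for d in dirnames:
                new_id = len(dir_table)
                dir_ids[os.path.join(dirpath, d)] = new_id
                dir_table[new_id] = (this_dir, d)
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                if st.st_nlink > 1:
                    records.append((st.st_dev, st.st_ino, this_dir, name))
        return dir_table, records

    def rebuild_path(dir_table, dir_id, name):
        # Follow the parent pointers back to the root to get the full path.
        parts = [name]
        while dir_id is not None:
            dir_id, dname = dir_table[dir_id]
            parts.append(dname)
        return os.path.join(*reversed(parts))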
Also, quite likely using a fast compression algorithm like lzop on the file would make disk access faster and disk storage requirements lower.
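For illustration only (gzip at its fastest level standing in for lzop, just because Python ships it; the principle is the same):

    import gzip

    # Stream the (inode, path) records through a compressor instead of
    # writing them raw; the tab-separated layout is a made-up example.
    records = [(2049, 123456, "src/a/x", "dst/a/x"), (2049, 123456, "src/b/y", "dst/b/y")]
    with gzip.open("/tmp/records.gz", "wt", compresslevel=1) as out:
        for dev, ino, src, dst in records:
            out.write(f"{dev}\t{ino}\t{src}\t{dst}\n")

    with gzip.open("/tmp/records.gz", "rt") as inp:
        for line in inp:
            dev, ino, src, dst = line.rstrip("\n").split("\t")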
> You're also doing two passes, one to enumerate the files, another to do the copying and hardlinking
Right now I'm doing exactly the same number of system calls as the cp approach needs. The second pass doesn't do readdir or, currently, even stat on the source file (that's why I saved the stat info above; which of the two is more beneficial would need to be tested). It may be beneficial to sort the sorted-by-inode file back into sorted-by-path order after the inode-cluster recognition step, to regain the original path locality during the copy phase (the included test/genfiles script already does that); I'll have to test that.
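Something like this (hypothetical record layout; both sorts would be external in practice):

    # Records are (dev, ino, src, dst), already sorted by (dev, ino, src).
    # The first path of each inode group becomes the "leader" (the real copy);
    # the others become hard links to the leader's destination.
    records = [
        (2049, 111, "src/dir2/b", "dst/dir2/b"),
        (2049, 111, "src/dir9/x", "dst/dir9/x"),
        (2049, 222, "src/dir1/a", "dst/dir1/a"),
    ]
    leader, plan = {}, []            # plan: (src, dst, link_target or None)
    for dev, ino, src, dst in records:
        if (dev, ino) not in leader:
            leader[(dev, ino)] = dst
            plan.append((src, dst, None))                  # real copy
        else:
            plan.append((src, dst, leader[(dev, ino)]))    # hard link to leader

    # Re-sort the plan by source path for the copy phase. Because the leader
    # is the lexicographically first path of its group, it always sorts before
    # the links that depend on it, so the plan stays executable in this order.
    plan.sort(key=lambda r: r[0])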
> unless you have an on-disk sort it will still be RAM+swap limited
Using an external (on-disk) sort algorithm is the whole point; I've been saying that from the start. I remember that Knuth wrote about external sorting, so I conclude such algorithms exist, and I quite expect that GNU sort(1) already sorts big files efficiently (limited testing seems to confirm that: files several times the size of RAM sort just fine in the low single-digit minutes).
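For reference, the kind of external merge sort I mean, reduced to a sketch (GNU sort(1) does essentially this, with many more refinements):

    import heapq, itertools, tempfile

    def external_sort(lines, chunk_lines=1_000_000, tmpdir=None):
        """Minimal external merge sort: sort fixed-size chunks in RAM, write
        each sorted run to a temp file, then lazily merge the runs. RAM usage
        is bounded by the chunk size, not by the total input size."""
        it = iter(lines)
        runs = []
        while True:
            chunk = sorted(itertools.islice(it, chunk_lines))
            if not chunk:
                break
            run = tempfile.TemporaryFile("w+", dir=tmpdir)
            run.writelines(chunk)
            run.seek(0)
            runs.append(run)
        return heapq.merge(*runs)

    # usage (every line must end in "\n" so the runs round-trip cleanly):
    # for line in external_sort(open("records.txt")): ...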
> I'd be very surprised if that was faster
You're ignoring how much slower disk seeks are compared to serial reading/writing. A lot of similar optimizations are done in OSes, databases, etc. for exactly that reason. I guess you'll argue next that SSDs don't have such high seek times (and that everybody should be using them); those points are fair, but SSDs are still a lot slower for random than for serial access, so whether the overhead of the sorting approach is warranted on SSDs remains to be tested. In any case, I very much expect it to be faster in the OP's scenario, which was with spinning disks (note how it took him many days; I expect the sort for the same data to take a few hours, quite likely less than an hour, given his RAID array).
> Also, quite likely using a fast compression algorithm like lzop on the file would make disk access faster and disk storage requirements lower.
> Using an external (on-disk) sort algorithm is the whole point; I've been saying that from the start. I remember that Knuth wrote about external sorting, so I conclude such algorithms exist, and I quite expect that GNU sort(1) already sorts big files efficiently (limited testing seems to confirm that: files several times the size of RAM sort just fine in the low single-digit minutes).
The cp algorithm already solves this just fine, and my proposed on-disk reimplementation of it solves the RAM usage problem. You're trying to make a strictly worse algorithm work: it both requires more storage and is heavier than it needs to be (a full sort versus just a map). What do you expect to gain from it?
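Concretely, the kind of thing I mean (a rough Python stand-in that keeps the inode map in a dbm file instead of the heap; not an actual cp patch):

    import dbm, os, shutil

    def copy_tree_diskmap(src_root, dst_root, mapfile="/tmp/linkmap.db"):
        """Same single-pass algorithm as cp: remember, per multiply-linked
        (dev, inode), the destination of its first copy and hard-link the
        rest, but with the map living on disk so RAM usage stays flat."""
        with dbm.open(mapfile, "n") as linkmap:        # "n": create a fresh database
            for dirpath, dirnames, filenames in os.walk(src_root):
                rel = os.path.relpath(dirpath, src_root)
                os.makedirs(os.path.join(dst_root, rel), exist_ok=True)
                for name in filenames:
                    src = os.path.join(dirpath, name)
                    dst = os.path.join(dst_root, rel, name)
                    st = os.lstat(src)
                    if st.st_nlink == 1:
                        shutil.copy2(src, dst)
                        continue
                    key = f"{st.st_dev}:{st.st_ino}".encode()
                    if key in linkmap:
                        os.link(linkmap[key].decode(), dst)
                    else:
                        shutil.copy2(src, dst)
                        linkmap[key] = dst.encode()
        # cleanup of the dbm file(s) is omitted; the exact filenames depend on the backend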