A fun optimization trick from rsync

vortico · on Oct 29, 2018

I don't like it. You should put the logic where the logic should be, so it's clear to see what was intended.

    typedef int (*MethodFunc)();

    void doMethod() {
        static int methodId = 0;

        static const MethodFunc methodFuncs[] = {
    #ifdef METHOD_1_CAN_COMPILE
            method1,
    #endif
            method2,
            ...,
            NULL
        };

        while (methodFuncs[methodId]) {
            int success = (methodFuncs[methodId])();
            if (success)
                return 0;
            methodId++;
        }

        return -1;
    }

blux · on Oct 29, 2018

In case performance is all you are after, this has much less potential for inlining than the rsync implementation has.

vortico · on Oct 29, 2018

True, but for rsync's case of setting the mtime of a file, performance of this logic is utterly insignificant.

bayindirh · on Oct 29, 2018

"Everything is fast for small n." -Knuth.

While syncing some million files, I'm sure it'll have an impact tho.

P.S.: Yes, we use it for a million files, regularly.

blibble · on Oct 29, 2018

it's probably not noticeable for a million files

the syscall is going to be many, many orders of magnitude slower than the difference between a direct and an indirect call

bayindirh · on Oct 29, 2018

IIRC, switch statements bigger than a certain sizes are compiled to jump tables. So it's just a static jump after the code finds a working method.

In theory, it's a very cheap piece of assembly after that point.

vortico · on Oct 29, 2018

With trivial method() functions, you could speed through that loop a million times in a millisecond.

stouset · on Oct 29, 2018

If all of the methodFuncs fail, this will attempt to jump to whatever address happens to be in memory after the last element of methodFuncs. You've forgotten to stop iterating.

ahoka · on Oct 29, 2018

There is a terminating NULL at the end of the array to stop the iteration.

jbverschoor · on Oct 29, 2018

oh my... that never went wrong

dsamarin · on Oct 29, 2018

A terminating null element or logical equivalent in an array is helpful in lots of situations. As one example it simplifies the logic of looking for an element in an array and performing some special operation if it was not found. No bool found variable necessary.

jbverschoor · on Oct 29, 2018

Well yes, but it can easily be removed by accident, and is easy to oversee. See the parent comment

matthewaveryusa · on Oct 29, 2018

That's not clever, that's language abuse and horrific -- You could do this in so many other ways there is absolutely no point in abusing #includes mid-source-code (without mentioning the code smell around statics.)

Now, as a working curiosity, this is pretty damn cool.

nwmcsween · on Oct 29, 2018

Not really its just a static var on a switch statement nothing really bad about it, in fact its a semi nice way to handle a simple state machine.

repsilat · on Oct 29, 2018

Not tooo uncommon either. It's a "reasonable" way to write generators in C (if you're masochistic Python programmer, I guess.)

Switch on the last return point at the top of the function, and make all variables that need to persist between calls `static`.

(I've written it before, and I swear I didn't invent it... Maybe it's just a short skip from Duff's Device? Still, using it to pick up before the last return instead of after is new trick for me!)

dimatura · on Oct 29, 2018

Yes, it's basically the idea in Duff's device. Protothreads (http://dunkels.com/adam/pt/) uses a pretty similar trick.

repsilat · on Oct 29, 2018

Ah, that's exactly the trick I was thinking of. I think I learned it from [1], but I'm pretty sure Protothreads predates that post. Thanks for the link!

1: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

_0w8t · on Oct 29, 2018

This was modeled on early Fortran where all local variables were static and it was trivial to implement persistent state machines.

geezerjay · on Oct 29, 2018

To my eyes it also looked like a Duff's device which kept track of the successful state.

Piskvorrr · on Oct 29, 2018

Yup. First thing that popped into my mind - with Duff's observation following shortly thereafter. Clever...too clever by half, IMNSHO.

imron · on Oct 29, 2018

> without mentioning the code smell around statics

Or the code smell around fall through switch statements.

There are legitimate uses for fall through switch statements and/or mid-file includes, and or static function vars, but combining all of them together is not clever, it's a recipe for bugs.

ktm5j · on Oct 29, 2018

Im with you. This is great and clever, but I would have to maintain code like this.

usefulcat · on Oct 29, 2018

An interesting trick, but it seems like there would be a simpler way, like having set_the_mtime be a function pointer. Initially it would point to an implementation that figures out which method works and sets the function pointer to point to a particular implementation.

beefsack · on Oct 29, 2018

Sounds really clever, but also has the potential to be really opaque and confusing. On the other hand, the trick in the article is quite verbose.

Can be tricky balancing terseness with readability.

usefulcat · on Oct 29, 2018

> has the potential to be really opaque and confusing

Yes, but.. compared to switch and preprocessor hacks described in the article? Granted it's subjective but I'd find a function pointer that's assigned to once to be way easier to comprehend.

carlmr · on Oct 29, 2018

Yeah, I think both methods are somewhat opaque, and the one described here is opaque + hacky. I wouldn't let this through code review.

Sharlin · on Oct 29, 2018

I would say that this switch hack is lightyears more "clever" (that is, opaque and confusing) than a simple function pointer.

ychen306 · on Oct 29, 2018

It's more inliner-friendly this way though.

megous · on Oct 29, 2018

No. Compiler can't assume value of switch_step at compile time, so it would have to inline the whole horrid thing every single time.

corndoge · on Oct 29, 2018

no guarantee the signatures are the same on all functions, can't use a fp

jfries · on Oct 29, 2018

In that case wrapper functions can be created where needed to normalize the signatures.

bayindirh · on Oct 29, 2018

Creating signature normalization wrappers, which contain superset of all the function parameters really complicates the code a lot.

It's like trying to push anything and everything through a little hole.

kochthesecond · on Oct 29, 2018

Have you heard of our Lord and savior, the single method interface?

alecbenzer · on Oct 29, 2018

Awesome, that's horrifying.

One concrete issue with this technique: it doesn't "loop". Meaning, for example, if initially method #3 doesn't work but method #4 does, it'll settle on method #4. If later, method #3 starts working, it won't switch it. Worse, if method #4 stops working, it'll just always fail.

I imagine this isn't an issue for rsync's use but could be a concern if applying it more generally.

lilyball · on Oct 29, 2018

According to the article, a step isn't skipped because it had a transient failure, it's skipped because it returns a special error code ENOSYS that means the syscall hasn't even been implemented, and therefore there's no point in ever trying it again.

carlmr · on Oct 29, 2018

That's correct. Although it's a very specific use-case.

scatters · on Oct 29, 2018

AIUI with containers it's now entirely possible that a running program may be migrated from a host where a syscall has not been implemented to one where it has.

vortico · on Oct 29, 2018

Just remove the `static` keyword and there you go.

But it's an "optimization", and avoiding previously attempted methods is the whole point of the trick. Otherwise it would be an easy and uninteresting function.

lazyjones · on Oct 29, 2018

> Just remove the `static` keyword and there you go.

That‘ll probably try all methods on every single file that is processed, so if method #16 is the one that works, you get 15 unnecessary syscalls for every file.

vortico · on Oct 29, 2018

Exactly, that's what the parent comment was saying.

Sean1708 · on Oct 29, 2018

Lets say that there are only 3 methods and method #2 is the one that works initially. On the first call it'll try #1 and when that doesn't work it'll try #2 which will be successful. For the next however many calls it'll jump straight to #2. If #2 starts returning ENOSYS at some point it'll try #3 and if that works then all future calls will jump straight to #3. However if #3 stops working then it will keep on failing until #3 starts working again even if #1 started working after the first call.

I think (I haven't actually tested it) you can do what the parent commenter wants by wrapping the whole thing in a do-loop but the original implementation is questionable enough, sticking an extra loop in there isn't doing anything to help make it less horrible.

  int set_the_mtime(...) {
    static int switch_step = 0;

    int loop_again = 1;

    do {
      switch (switch_step) {

      case 0:
        if (method_0_works(...))
          return 0;

        switch_step++;
        /* FALLTHROUGH */

      case 1:
        if (method_1_works(...))
          return 0;

        switch_step++;
        /* FALLTHROUGH */

      case 2:
        if (method_2_works(...))
          return 0;

      }

      switch_step = 0;
    } while (loop_again--);

    return -1;   /* ultimate failure */
  }

kelnage · on Oct 29, 2018

My understanding of the code from the article is that if method #4 stopped working, it would just carry on down the list. The static variable just means where to skip to in the case list.

ThePhysicist · on Oct 29, 2018

Andrew Tridgell’s thesis is also a very interesting read, as rsync has some pretty innovative algorithms that make the file transfer and remote diffing fast:

https://rsync.samba.org/~tridge/phd_thesis.pdf

ecma · on Oct 29, 2018

Thanks, I hate it. I'm a bit bemused about this bizarro case_N.h trickery. I'm also almost positive __COUNTER__ would be adequate and entirely remove the need for __LINE__ as suggested in the addendum which is likely to be quite large in any non-trivial source unit. I mean, something gcc this way comes I guess...

scatters · on Oct 29, 2018

__COUNTER__ is not standardized. Repeated inclusion is.

netheril96 · on Oct 29, 2018

So thread unsafe?

romed · on Oct 29, 2018

Note that if this were C++ (11 or later) this could be done with the static variable in a thread-safe way, since function-scoped static variables initialized by functions are guaranteed to be initialized exactly once. So this code could be something like:

  int set_the_mtime(...) {
    static int starting_from = set_mtime_starting_from(0);
    set_mtime_starting_from(starting_from);
  }
  // Try various ways of setting mtime until it works.
  int set_mtime_starting_from(int starting_from) {
    // horrible switch statement
  }

repsilat · on Oct 29, 2018

In C++ you could also make an object (per thread?) and use member vars instead of static vars.

Also, if you're going to use static vars in multithreaded contexts doing this, better to write

   thing = 5;

(via that macro magic) than `thing++`. Then at least nothing too bad should happen...

usefulcat · on Oct 29, 2018

Definitely

tropo · on Oct 29, 2018

It doesn't have to be. It just is, because there is no need for the overhead of dealing with threads.

jbverschoor · on Oct 29, 2018

So without actually looking into anything. What id you have a directory structure which is built out of 2 mounts. One local fs and one network mounted. Then lets say method 5 works on one, and method 1 works on the other, both are the only ones that work. Then it would basically fail, depending on the order of directories enumerated, eventhough there is a supported method.

de_watcher · on Oct 29, 2018

I haven't looked at the code, but the ENOSYS error is more about the capabilities of the kernel. So when we eventually start swapping entire kernels at runtime then it'll break.

bacon_waffle · on Oct 29, 2018

Edit: d'oh - I misread, the addendum is doing exactly this...

I don't understand the need for the macro-fu to generate the case values. The use of switch_step++ in each previous case seems to drive that, but couldn't the same be accomplished by setting switch_step to a constant?

  static int switch_step = 0;
  switch(switch_step) { // Falls through between cases
    default:

    #if METHOD_0_AVAILABLE
      switch_step = 0;
    case 0:
      if (method_0_works(...))
        break;
    #endif METHOD_0_AVAILABLE

    #if METHOD_1_AVAILABLE
      switch_step = 1;
    case 1:
    ...
  }

OskarS · on Oct 29, 2018

So, slightly off topic, but one thing from C that I really miss in other languages is subroutine-local static variables. It always annoys me in Java or C# that you can't really make a static variable that's scoped to a single method or function, you have to scope them to the entire class. It's common enough that a private static variable is only needed in a single method or function, so why not do it the way C does it?

VMG · on Oct 29, 2018

In js you can use `this.x` or `this.methodname.x`

Jedd · on Oct 29, 2018

rsync is an amazing tool, full of delightful secrets.

One thing it doesn't have natively (unless it's really well hidden) is the ability to look at two directory structures, and generate a delta of the two out to a separate location (say a USB stick).

A very useful feature for backups where you've got good enough (to run an rsync --dry-run) network connectivity, but not good enough to actually do the transfer.

rstuart4133 · on Oct 29, 2018

Unless you actually want whole files stored in a directory tree on your USB, --only-write-batch does exactly that. From the man page:

   you can feel free to write the batch directly to some portable media

Jedd · on Oct 29, 2018

Wow .. how long has that been there?

(Sadly it's clear it was in fact there long before I wrote a workaround in response to not finding it.)

lazylizard · on Oct 29, 2018

can rsync --dry-run and | output to a file?

Jedd · on Oct 29, 2018

Yup, that's what I ended up doing [1] - but was mildly surprised it wasn't built in, and very surprised that I couldn't find anyone else's solution.

[1] https://jeddi.org/b/2016/05/28/rsync-from-a-to-b-via-usb/

akvadrako · on Oct 29, 2018

I think he means a diff that includes the differences in content.

waingake · on Oct 29, 2018

I'd be interested if someone could show me a FP friendly solution to this problem.

toolslive · on Oct 29, 2018

Some compilers turn large switches into lookup tables. Using side effects like this however, makes the compiler shy away from this optimisation, and compile it literally. So it's probably not a good idea anymore.

ygra · on Oct 29, 2018

I'm not sure switch performance is a relevant concern for setting the last modification time of a file.