Objective-C Blocks vs. C++0x Lambdas: Fight

comex · on June 4, 2011

> One place where useful optimizations could be made are inline functions which take block parameters, since the optimizer is able to improve the inlined code based on the calling code. However, as far as I know, no current blocks-capable compilers perform any of these optimizations, although I haven't investigated it thoroughly.

> [...]

> for_each is a template function, which means that it gets specialized for this particular type. This makes it an excellent candidate for inlining, and it's likely that an optimizing compiler will end up generating code for the above which would be just as good as the equivalent for loop.

This is somewhat unfair, as it seems to reflect the common misconception that (e.g.) C++ sort is faster than C qsort because it uses templates, when in fact qsort would be just as fast if its implementation were written in the .h file, as C++ sort's is. Compilers are perfectly capable of inlining calls to function pointers.

Calls to blocks should be able to be inlined too, but I guess they're still a new feature; I did a quick test, and it seems that gcc cannot inline them, but llvm-gcc and clang can.

In this case, the article suggests:

   [array do:^(id obj) {
       NSLog(@"Obj is %@", obj);
   }];

Objective-C message calls are never inlined since they are dynamic, but if you write something like:

  static void for_each(NSArray *array, void (^callback)(id)) {
     for(id obj in array) {
         callback(obj);
     }
  }
  [...]
  for_each(array, ^(id obj){ NSLog(@"%@", obj); });

llvm-gcc and clang are able to generate code equivalent to doing the for loop directly.

ot · on June 4, 2011

> Compilers are perfectly capable of inlining calls to function pointers.

Does this happen even if the function that calls the callable (in this case sort) is not inlined? This would mean that the compiler generates a specialized function, say sort_{somerandomnumber}, with the user-defined callable inlined into it, and this seems kind of unlikely to me (while with templates the compiler is forced to do so).

In both of your examples the body of the caller (array do and for_each) is so small that I assume it is inlined.

comex · on June 4, 2011

No, but you could get the same effect by writing a short wrapper function and getting sort inlined into that. (Might be difficult in some cases, but I could imagine a "qsort_efficient" which is declared __attribute__((always_inline)).)
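A rough sketch of that wrapper idea, in C++ with GCC/Clang attribute syntax. The names `sort_efficient`, `ascending` and `sort_ascending` are made up for illustration, and an insertion sort stands in for a real sort to keep it short. Whether the comparator actually ends up inlined still depends on the compiler and optimization level.

```cpp
#include <cstddef>

// Hypothetical "qsort_efficient": a generic sort that takes a plain
// function pointer but is forced inline into each caller (GCC/Clang syntax).
// Insertion sort keeps the sketch short; it is not what a real qsort does.
__attribute__((always_inline)) static inline void
sort_efficient(int *a, std::size_t n, bool (*less)(int, int)) {
    for (std::size_t i = 1; i < n; ++i) {
        int key = a[i];
        std::size_t j = i;
        // Shift larger elements right until key's slot is found.
        while (j > 0 && less(key, a[j - 1])) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}

static bool ascending(int x, int y) { return x < y; }

// Short wrapper: once sort_efficient is inlined here, `less` is a known
// constant, so the optimizer is free to inline the comparison as well.
void sort_ascending(int *a, std::size_t n) {
    sort_efficient(a, n, ascending);
}
```

The point of the wrapper is that only `sort_ascending` needs to exist as a real function; the generic body is duplicated per call site, much as a template instantiation would be.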

mustpax · on June 3, 2011

This really captures the core difference between the two languages. C++ is overly versatile to the point of being cumbersome whereas Obj-C imposes sensible limitations for the sake of handling the common use cases much better.

cageface · on June 4, 2011

Or, C++ gives you speed and low-level control at the cost of some extra complexity.

iam · on June 4, 2011

There's hardly extra complexity in C++0x lambdas, you just need the & or = character to specify if variables are captured by value or by reference and off you go. You can even save the closure as an auto variable instead of needing to spell out the type! Seems easier to me.

Personally I prefer the C++0x closures precisely because of the reference/value capturing distinction.
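For anyone who hasn't seen the syntax, a minimal sketch of the two capture modes (the function names are just for illustration):

```cpp
// Returns what a by-value closure sees after the original variable changes.
int by_value_demo() {
    int x = 1;
    auto get = [=] { return x; };  // copies x when the lambda is created
    x = 2;                         // does not affect the copy
    return get();
}

// Returns the original variable after a by-reference closure modifies it.
int by_reference_demo() {
    int x = 1;
    auto bump = [&] { x += 10; };  // refers to this function's x
    bump();
    return x;
}
```

Here `by_value_demo()` yields 1 and `by_reference_demo()` yields 11, and `auto` spares you from naming the closure type in both cases.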

fauigerzigerk · on June 4, 2011

I prefer the C++ solution as well because detailed control of everything is what I use C++ for, but saying there is no extra complexity "you just need the ..." is funny because you could say the same about everything that's causing extra complexity.

Also, the requirement to add an empty parameter list whenever there's something between [] and {} amounts to extra complexity. This seems to be a purely syntactical disambiguation thing as neither the mutable qualifier nor the return type should be related to an empty parameter list.

We have to realise what complexity is. Part of it is having to think of B whenever we do A even though we do not wish to say anything about B at all.
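A small sketch of the two cases, as the C++0x rules stand (function names are illustrative only):

```cpp
// Counts calls using a by-value copy that the lambda is allowed to modify.
int mutable_demo() {
    int n = 0;
    // `mutable` cannot follow [n] directly; the empty () is required.
    auto next = [n]() mutable { return ++n; };
    next();
    return next();  // second call on the same closure object
}

int return_type_demo() {
    // An explicit return type likewise requires the empty ().
    auto half = []() -> double { return 1 / 2.0; };
    return static_cast<int>(half() * 10);
}
```

With neither `mutable` nor a return type, `[] { return 42; }` is accepted without the parentheses, which is what makes the requirement feel arbitrary.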

pilif · on June 4, 2011

As far as I understood the article, using references is a disaster waiting to happen, as accessing the references once the original variables are out of scope is "undefined behaviour" - probably accessing random values on the stack if your reference was pointing to a variable on the stack.

shin_lao · on June 4, 2011

Not using references doesn't help you at all; you have to recursively make sure the lifetime of all the objects you use is greater than the lifetime of your closure.

calloc · on June 3, 2011

I don't necessarily consider this a fight at all, I think each has their strengths and weaknesses. I much prefer the C++ method, however when programming in Objective-C using blocks just feels more natural.

barrkel · on June 3, 2011

The C++ method doesn't give you a trivial way of capturing variables rather than values, where those variables go on to have a life that matches the lifetime of the closure created. You have to work rather hard to get that, by wrapping things up in little holders which are captured by value, but don't do a deep copy when they themselves are copied. It reminds me a little of how arrays are sometimes used to capture final variables in a mutable way in Java anonymous inner classes.
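A minimal sketch of the holder workaround, assuming `std::shared_ptr` as the holder (`make_counter` is a made-up name):

```cpp
#include <functional>
#include <memory>

// Hypothetical helper: returns a closure whose captured variable outlives
// this function because it lives in a shared, reference-counted holder.
std::function<int()> make_counter() {
    auto count = std::make_shared<int>(0);  // the "little holder"
    // The shared_ptr is captured by value; copying the closure copies the
    // pointer, not the int, so every copy sees the same variable.
    return [count] { return ++*count; };
}
```

Calling the returned closure and a copy of it alternately keeps incrementing the same integer, which is the variable-capturing behaviour you'd get for free elsewhere.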

What this means is that whenever you pass off a lambda to a function, and that lambda captures a variable, you need to be aware of whether or not the function keeps a reference to the lambda someplace, so that you'll know if it's safe to capture by reference or not. If you capture by reference but the function keeps a reference to the lambda, and it gets called after the captured variable has gone out of scope, you'll get into some nasty trouble.

And this is why I don't like the C++ spec as it stands. It requires a kind of global knowledge to work with correctly in local contexts, in such a way that the compiler can't really help you either (it may have been possible to annotate types to indicate closure lifetime, but it would be painful without more powerful type inference than C++ has).

bitwize · on June 3, 2011

You simply can't do upward funargs in a C-family language without breaking the memory and activation-record model of the language. For example, let's say C++ had upward funargs. Any time you return a function value, or otherwise keep it around for longer than the lifetime of the function activation in which it is activated, any free local variables in the closure which are captured by the containing environment must refer to locations on the heap. (Copying them by-value breaks the semantics of closures.) This conflicts with the assumption in C++ that auto variables local to a function activation are part of the function's activation record on the stack.

You could get around this problem with spaghetti stacks or something, but you'd need to find a way to free the activation records that are floating around after their enclosing scopes have expired -- enter garbage collection which you do NOT want to require for C++.

That's the problem with Lisp, it's almost all-or-nothing. If you want to correctly include some of the benefits of the Lisp execution model -- like lambdas -- you need to accept the whole thing, lock, stock, and barrel. Including heap-allocated local vars and the garbage collector. (And yes -- Python, Ruby, Haskell, and Standard ML have a "Lisp execution model" in this sense.)

So we get compromises and hulking abominations like C++0x lambdas or -- worse yet -- their predecessor, Boost Lambdas.

tl;dr: Upward funargs are hard, let's go shopping.

ori_b · on June 3, 2011

You can do manual closure management.

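    typedef int (^closure_t)(void);

    /* dupclosure/freeclosure are hypothetical: copy a closure to the
       heap, and release that copy again. */
    closure_t f(void) {
        int x = 42;
        return dupclosure(^{ return x + 10; });
    }

    void g(void)
    {
        closure_t fn;

        fn = f();
        fn();
        freeclosure(fn);
    }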

It's not quite as pretty, but it's workable, and in line with a C-family language's semantics.

seabee · on June 4, 2011

However it does require severe adjustments to the lifetime rules for automatic variables, and you have to be mindful of e.g. custom allocators.

ori_b · on June 4, 2011

Custom allocators are relatively easy to handle with an API something like:

     void* closureheap(void (^)());
     size_t closuresize(void (^)());

Or, well, anything else that allows you to separate the step of allocating memory from its initialization. (There are probably more representation-independent APIs that would be better, but this was just off the top of my head.)

And, yes, mutable captured variables will not be shared across multiple duplications. Somewhat unconventional, but given the mental model of copying closures, I don't think it's a surprising behavior.
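C++0x lambdas with by-value capture happen to behave the same way, so they make a convenient illustration (this is not the hypothetical dupclosure API, just the analogous copy semantics; `Results` and `copy_demo` are made-up names):

```cpp
// The three values produced in order: the original's first call, the
// copy's first call, and the original's second call.
struct Results { int first, copy, second; };

Results copy_demo() {
    int n = 0;
    auto counter = [n]() mutable { return ++n; };
    int first = counter();     // original's n becomes 1
    auto duplicate = counter;  // copies the closure, including n == 1
    int copy = duplicate();    // duplicate's n becomes 2
    int second = counter();    // original's n becomes 2, independently
    return {first, copy, second};
}
```

The result is 1, 2, 2: after the copy, the two closures count separately rather than sharing one variable.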

barrkel · on June 3, 2011

I'm aware of all the issues; I implemented the feature in a commercial language that's semantically very similar to C (Delphi, a variant of Pascal). My point is that there are compromises, like reference counting, that make the whole thing much easier to use. Yes, reference counting is a form of GC, but it's also deterministic and localized, and with careful selection of implementation primitives in the runtime library, potentially open to user fiddling too (as C++ users are wont to do).

comex · on June 4, 2011

> [...] any free local variables in the closure which are captured by the containing environment must refer to locations on the heap.

Objective-C blocks automatically copy those variables to the heap. (Which is not to say they're not a bunch of compromises.)

calloc · on June 3, 2011

The issue you mention of not knowing the lifetime is something that happens with pointers and memory allocations as well in C++. Unless it is documented it can be a pain trying to figure out who is ultimately responsible for free'ing the memory that was allocated ...

Ultimately Garbage Collection would help there, but I am not sure we are going to see that in C++ anytime soon.

barrkel · on June 3, 2011

Yes, but the solution landscape to this problem domain is different. Automatic variables have well-defined lifetimes; they die when they go out of scope. This means there's greater scope for the compiler to take more initiative about lifetime of captured variables (which will all be automatic one way or another, i.e. implicit 'this', a local or a parameter; assume reference parameters etc. cannot be captured).

When I designed and implemented the same feature, anonymous methods in Delphi, I used reference counting to keep alive a heap-allocated activation record containing all captured variables. This works well for most scenarios; it can get into knots in more obscure situations where you have recursive lambdas that call themselves via a captured variable, but those are usually pretty rare.
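The knot can be sketched in C++ terms, using `std::shared_ptr` as a stand-in for the reference-counted activation record. This is an analogy only, not how the Delphi implementation works, and `Frame`, `run_cyclic` and `run_acyclic` are made-up names:

```cpp
#include <functional>
#include <memory>

// Stand-in for a heap-allocated activation record with one captured variable.
struct Frame {
    std::function<int(int)> fact;
};

// The closure holds a strong reference to the frame that owns it, so the
// count never reaches zero. `observer` lets the caller check afterwards.
int run_cyclic(std::weak_ptr<Frame> &observer) {
    auto frame = std::make_shared<Frame>();
    observer = frame;
    frame->fact = [frame](int n) -> int {
        return n <= 1 ? 1 : n * frame->fact(n - 1);
    };
    return frame->fact(5);
}

// Same recursion, but the closure refers back to its frame weakly, so the
// frame is released when the last outside reference goes away.
int run_acyclic(std::weak_ptr<Frame> &observer) {
    auto frame = std::make_shared<Frame>();
    observer = frame;
    std::weak_ptr<Frame> weak = frame;
    frame->fact = [weak](int n) -> int {
        auto self = weak.lock();
        return (!self || n <= 1) ? 1 : n * self->fact(n - 1);
    };
    return frame->fact(5);
}
```

Both compute the same factorial, but after `run_cyclic` returns the frame is still alive with nothing left to free it, whereas after `run_acyclic` it is gone.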

You're right that GC is a help. The biggest thing GC gives you is freedom from having to worry about who controls the lifetime of parameters and function return values, in most cases. In the presence of GC, you can get more clever about your algorithms and data structures; you can cache and memoize, without paranoid concern for things disappearing behind your back. Consider a querying API that takes in closures for sorting and selection functions; I can see it building up temporary results and caching them, or streaming results in a multi-threaded fashion, but it can only do that if it can reliably hang on to closures after the select/sort/etc. function has returned.

zwieback · on June 4, 2011

That's what I was thinking when reading the article. It almost seems like capturing stack variables by reference should not be allowed at all but that would be too restrictive.

Ultimately the C++ problems have to be resolved by conventions and idioms, just like we've all learned to be careful when passing a pointer to a local variable.

stephen_g · on June 4, 2011

What would garbage collection do that C++0x's shared_ptr smart pointer (which is reference counted) doesn't?

I'm not really sure how GC works but using normal memory allocation when you can and reference counted pointers when somebody else is responsible for deallocating objects seems to be adequate and is still high performance...

shin_lao · on June 4, 2011

You seem to want to use a C++ 1x lambda where you should be using a future, am I wrong?

lloeki · on June 3, 2011

I'd go as far as saying that (Obj)C blocks feel right at home in pure C (where I actually used them more than in ObjC) whereas C++ lambdas fit in, well... C++.

I just wish the actual passive-aggressive fight between the FSF and Apple would resolve and blocks could finally make it into upstream GCC C compiler.

calloc · on June 3, 2011

There is no one at Apple who can sign the copyright for the blocks code over to the FSF, and as such it will never happen. At least that is what was said on the LLVM/Clang mailing list.

I understand why the FSF wants copyright assignment, but it makes the process a lot longer and more complicated.

lloeki · on June 3, 2011

Probably Jobs can?

ben_straub · on June 3, 2011

Not likely. Apple's moved to a clang-based toolchain, haven't they?

lloeki · on June 3, 2011

They're still transitioning from pure gcc first to llvm-gcc then to llvm+clang, and blocks are available in all of them.

XCode 3.2 defaults to GCC 4.2 with LLVM optionally available and IIRC XCode 4 too (can't check as I downgraded for various reasons).

The blocks patchset against pure GCC exists, and the problem mostly lies in upstream GCC refusing patches whose copyright has not been assigned/transferred to the FSF (see https://lwn.net/Articles/405417/). The rationale is that a critical component such as GCC should not be at the mercy of multiple (possibly hundreds of) conflicting parties, and that holding the copyright in one place eases any relicensing needed to protect it.

While I understand the rationale behind this, my opinion is that it feels bureaucratic to the point of hampering notable innovative contributions while favoring local forks which will inevitably end up dying, as maintaining a fork (whatever the patchset size) against the march of a behemoth like GCC is essentially hopeless.

_tggb · on June 3, 2011

> XCode 3.2 defaults to GCC 4.2 with LLVM optionally available and IIRC XCode 4 too (can't check as I downgraded for various reasons).

Xcode 4 defaults to llvm-gcc, not gcc-4.2.

lloeki · on June 4, 2011

I wasn't quite sure about my memory of it when I wrote it, thanks for the correction. That's still not clang though.

bonch · on June 3, 2011

It's not a fight; Mike Ash was just using a tongue-in-cheek headline.

saurik · on June 4, 2011

Objective-C's blocks would be a million times more interesting to me if they allowed for runtime type inspection: given that they are already fully fledged Objective-C objects, the fact that they don't have some kind of "methodSignature" selector that returns the type code is simply confusing. If they had this feature, then they would be fully usable from dynamic languages (such as my Cycript, but also things like PyObjC, Nu, etc.) without having to manually specify the types everywhere (which is error prone and annoying, especially if all of the rest of your code is working just fine with no types specified at all). Without this feature, they seem to just be a really limited version of C++0x lambdas tied to Objective-C's reference counting semantics.

Johngibb · on June 3, 2011

Boy, do I prefer how lambdas in C# work.

stonemetal · on June 3, 2011

The article is more interesting than the juvenile title makes it sound.

tldr: Lambdas provide more flexibility but are more complicated to use. Blocks integrate with Objective-C better and are simpler to use.

mikeash · on June 3, 2011

The title is tongue in cheek, basically a riff on how many people consider C++ and Objective-C to be mortal enemies.

rudiger · on June 3, 2011

I liked the title...