Lambda expression comparison between C++11, C++14 and C++17 (maitesin.github.io)
175 points by ingve on May 14, 2016 | 85 comments



I love lambdas, but a lot of commenters are throwing around the word "closure" here, and C++ lambdas are definitely not closures. You can capture outside variables by value or by reference, but an object captured by reference can be destroyed before the lambda runs, in which case you are in trouble. In a true closure (as in Lisp or other languages), the closed-over value stays around.

If we talk only about c++ capture lists by value (i.e. [=]), then you could make a case for a more appropriate use of the word "closure" but since many lambdas do more than this, I think the distinction is necessary.

However, even in by-value captures, if you are capturing a pointer by value, the issue remains. So, really it is not a good idea to think of lambdas as closures in the functional sense typically used in other languages.
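
A minimal sketch of the lifetime difference being described, assuming C++11 or later. The function names are made up for illustration, and the dangling case is left commented out because running it would be undefined behaviour:

```cpp
#include <functional>
#include <memory>

// Safe: `n` is copied into the closure object, so the copy lives exactly as
// long as the returned std::function does.
std::function<int()> make_by_value() {
    int n = 41;
    return [n] { return n + 1; };
}

// Also safe: the shared_ptr itself is copied, which keeps the int alive.
std::function<int()> make_shared_counter() {
    auto count = std::make_shared<int>(0);
    return [count] { return ++*count; };
}

// NOT safe: `n` is destroyed when the function returns, so invoking the
// result would read through a dangling reference.
// std::function<int()> make_dangling() {
//     int n = 41;
//     return [&n] { return n + 1; };
// }
```

The by-value versions behave like closures in garbage-collected languages; the by-reference version is where the programmer has to reason about lifetimes.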


I find that pointlessly pedantic. By the same measure, languages that don't offer bignums don't offer integers. Even Lisps need to implement closures in one way or another, and you may be surprised to see how they actually do it.


I think you miss the point. The OP was describing the property that, in most languages, the outer variables a closure captures stay alive even after the outer function returns. In C++ they can expire (that is what I got from his comment, anyway).


Closed-by-value variables (i.e. the default) in C++ don't expire. A referenced or pointed-to object might, but this is completely consistent with the rest of the language.

Remember that C++ is a by-value language. Pointers are explicit, first class, and distinct from the pointed-to object.

References are weird, though.


You would have to combine copying and reference management, e.g. "[=]" and std::shared_ptr<>. It definitely requires the programmer to pay more attention though, compared to other languages/constructs.
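
Roughly what that combination looks like, as a sketch assuming C++11. `Session` and both function names are hypothetical; the second function only exists to observe the lifetime through a `weak_ptr`:

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical resource type, used only for this illustration.
struct Session {
    std::string user;
};

// [=] copies the shared_ptr `s` into the closure, so the Session stays alive
// for as long as the returned callback (or any copy of it) exists.
std::function<std::string()> make_greeter() {
    auto s = std::make_shared<Session>(Session{"alice"});
    return [=] { return "hello, " + s->user; };
}

// Returns true if the Session is alive while the callback exists and is
// destroyed once the callback has gone out of scope.
bool session_tracks_callback_lifetime() {
    std::weak_ptr<Session> watcher;
    {
        auto s = std::make_shared<Session>(Session{"alice"});
        watcher = s;
        std::function<void()> cb = [=] { (void)s->user.size(); };
        s.reset();                            // drop the local owner
        if (watcher.expired()) return false;  // the closure should still own it
    }                                         // cb is destroyed here
    return watcher.expired();
}
```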


I'm not familiar with closures from other languages. What about capturing a shared_ptr by value?


The thing to keep in mind is that copying a shared_ptr isn't cheap at all. It's a class with a pointer and atomic reference count inside and the atomic inc/dec takes many cycles.


How does this compare to the cost of a closure in other languages? Yeah atomic reference counts are not cheap, but basically that's the point of a shared_ptr.


I'm not sure. I don't program in C++ because I want it to have performance comparable to other languages.


Nice overview. Just a small correction: the smallest possible lambda is

    []{}
and not

    [](){}
as stated in the article. The parameter list is optional.


>The parameter list is optional.

To avoid confusion, only an empty parameter list is optional. You can't omit the parameter list if you take args, unlike lambda shortcuts in languages like clojure which allow you to implicitly refer to arguments inside the body using a universal variable name, and therefore omit the parameter list for all lambdas. That possibility does not exist in c++.
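
To make the rule concrete, a small sketch (the variable names are only for illustration):

```cpp
// The first two lambdas are equivalent; the third takes an argument, so its
// parameter list cannot be omitted.
auto nothing       = [] {};                        // parameter list omitted
auto nothing_paren = []() {};                      // empty parameter list spelled out
auto twice         = [](int x) { return 2 * x; };  // parameters require the list

int call_them() {
    nothing();
    nothing_paren();
    return twice(21);
}
```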


Another mildly obscure feature of lambdas is the ability to capture a variadic number of parameters.

Example (slightly contrived):

  #include <future>
  #include <iostream>

  template<typename... Args>
  void log(Args&&... args) {
      (std::cout <<  ...  << args) << std::endl;
  }

  template<typename... Args>
  std::future<void> log_async(Args&&... args) {
    return std::async(std::launch::async, [args...] { log(args...); });
  }

  int main()
  {
      auto f = log_async(1, 2, 3);
      f.wait();
  }


For those of us still stuck on C++98 at work, would you mind explaining what's going on here? In particular, I can't figure out why the ellipsis is so separated from `args` here:

        (std::cout << ... << args) << std::endl;
That looks like some black magic to me. The rest makes sense, I think.



Neat! Thank you :)


Heh, great. I used to put an ellipsis into a programming example to mean “fill in whatever you actually do here”, and now C++ went and made it mean something. :)


Well, I feel like an old man telling kids to get off my lawn, but this syntax looks ridiculous. :/


Which standard version?


That snippet depends on fold expressions, which are in c++17. AFAIK, capturing a variadic parameter pack should work in C++11.
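
For anyone without C++17, a sketch of the same idea that should compile as C++11. It replaces the fold expression with the common braced-initializer expansion trick; `join` and `join_later` are made-up names, and it builds a string instead of writing to `std::cout`:

```cpp
#include <sstream>
#include <string>

// Expand the pack inside a braced initializer list, which guarantees
// left-to-right evaluation, instead of using a fold expression.
template <typename... Args>
std::string join(const Args&... args) {
    std::ostringstream out;
    int expand[] = {0, ((out << args), 0)...};
    (void)expand;  // silence "unused variable" warnings
    return out.str();
}

// Capturing the pack by copy with [args...] is itself valid C++11.
template <typename... Args>
std::string join_later(Args... args) {
    auto deferred = [args...] { return join(args...); };
    return deferred();
}
```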


Taking a concept such as a lambda function and making it look this ugly...this is why I hate C++. I wish I wasn't forced to program it every day.


How is it ugly? The capture list is a necessary complexity in a language with manual memory management.


It's a trade off more than a necessity. For example, Rust doesn't have explicit capture lists, and if you want explicit control, you make new bindings and capture those. You almost never need to do this in Rust, so it's optimized for that case; I haven't written many closures in C++, so I can't say as much about the frequency there.

To make this more concrete:

    let s = String::from("s");
    
    let closure = || {
        println!("s is: {}", s);
    };
    
    closure();
If you wanted to capture s in a different way:

    let s = String::from("s");
    
    let s1 = &s;
    let closure = || {
        println!("s1 is: {}", s1);
    };
    
    closure();
No capture list needed, same control.


Rust doesn't have explicit capture lists, but it does have the `move` modifier on closures which is like C++'s `[=]`.

Strictly speaking, Rust probably could have gotten away with having neither the `move` modifier nor capture clauses at all, but it would have had wide-ranging implications on the ergonomics and capability of closures.


How would you do the equivalent of this in Rust?

  auto on_heap = std::make_unique<MyType>(...);

  function_that_accepts_lambda([obj = std::move(on_heap)]() {
     obj->bar(...);
  });


    let on_heap = Box::new(...);

    function_that_accepts_lambda(move || {
        on_heap.bar();
    });
This is sort of what kibwen was mentioning: move is a single annotation that overrides everything to capture by value rather than have it inferred.


What if you want to move some things, but copy others?

e.g.

  auto shared = std::make_shared<MySharedType>(...);
  auto unique = std::make_unique<MyOwnedType>(...);

  function_that_accepts_lambda([shared, u = std::move(unique)] {
      shared->foo(...); u->bar(...);
  });

  // Outer scope can still use shared.
  shared->foo(...);


I am 99% sure this is identical:

    let on_heap = Box::new(...);
    let shared = Arc::new(...);

    let s = shared.clone();
    function_that_accepts_lambda(move || {
        on_heap.bar();
        s.foo();
    });
We have to make the extra s binding.

Also, my sibling is correct that Copy types will just be copied, not moved.


If the type implements Copy, values will be implicitly copied when moved into the Rust closure (I think). Or you can declare a scoped variable and clone() manually.


The syntax is not the prettiest, but it is legible once you understand what [](){} means.

In C#, there is no such thing, but there is a part of me that wishes we had such a thing. I like the ability to explicitly state what variables are being captured.


> I like the ability to explicitly state what variables are being captured.

Why? You state what variables are being captured by just using them in the lambda body.


> You state what variables are being captured by just using them in the lambda body.

Wow, have you never spent a week debugging a JavaScript memory leak?


No. What I have done on the other hand was add unused lexical variables to an anonymous function so the runtime wouldn't optimise them out of the closure and I could still see them in the debugger.


I'm not 100% sure, but C#'s compiler should automatically capture what you need (and leave out the rest).

I think the primary need for manual declaration is because in C++ you need to differentiate between pass by copy semantics and pass by reference semantics.


> I think the primary need for manual declaration is because in C++ you need to differentiate between pass by copy semantics and pass by reference semantics.

That's not actually a need, C++ includes [=] and [&] (capture everything by value or by reference). You can get a mix by creating references outside the body then capturing the environment by value (capturing the references by value and thus getting references).

On the one hand it has a bit more syntactic overhead (you have to take and declare a bunch of references before creating the closure), on the other hand there's less irregularity to the language, and bindings mean the same thing in and out of the closure.

FWIW that's what Rust does[0], though it may help that Rust's blocks are "statement expressions", some constructs would probably be unwieldy without that.

[0] the default corresponds to C++'s [&] (capture by ref), and a "move closure" switches to [=] instead
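
One caveat if you try this in C++ itself: `[=]` applied to a plain reference variable copies the object it refers to, so the handle has to be a pointer or a `std::reference_wrapper`. A sketch under that assumption (the function name is hypothetical):

```cpp
#include <functional>
#include <string>

int mixed_capture_demo() {
    int hits = 0;
    std::string label = "x";

    // A plain `int& r = hits;` would not help: [=] would copy the int.
    // A reference_wrapper is a copyable handle to the original instead.
    auto hits_ref = std::ref(hits);

    auto bump = [=] {
        // `label` is a private copy; `hits_ref` still points at outer `hits`.
        hits_ref.get() += static_cast<int>(label.size());
    };

    label = "longer";  // does not affect the copy already inside `bump`
    bump();
    bump();
    return hits;  // 2: each call added the size of the copied "x"
}
```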


Yep, in Microsoft’s C# compiler, only the closed over local variables of a function are captured (which, in C#’s case, means generating a class with fields corresponding to each closed over local, and then replacing those locals with references to their respective fields of an instance of that class).


PHP also doesn't capture variables by default (except $this) and I like it that way.


The only thing I find somewhat frustrating about the syntax is that the notation messes with my existing expectations. Up until now, in C-like languages a [] was just for collections and indexing into them, in the code I used at least.

I mean I'm not really complaining; I don't see better syntax to fit short anonymous functions into the existing syntax, without defeating the whole purpose of it either.

I suspect it's just a matter of getting used to this extra meaning for square brackets.


Very often the capture list is empty. It could have been elided (as the parameter list can be) if the syntax could have been made unambiguous.


Well, you need something to indicate the beginning of a lambda. So you can think of "[]" as serving that role, instead of "lambda" in Python or "\" in Haskell.


You can use C# fat arrow syntax; it even allows removing the braces for single-expression lambdas, which is the most common form anyway.


> removing the braces

If I have learned anything over the years, it's that removing the braces anywhere is the introduction of a bug during a rewrite waiting to happen.


It really works without any issues in C# from my experience.

Statements like:

    list.Where(e => e.Property == ExpectedValue).Select(e => e.Property)
are much cleaner than something like:

    list.Where([](auto e) { return e.Property == ExpectedValue; }).Select([](auto e) { return e.Property; });
Braces and capture declarations add zero value here, and that's ~80% of the use cases I see for lambdas.


For historical interest, compare an early proposal: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2006/n195...


What would you propose as the syntax?


Apple extended C with block closures years ago.

  int b = 0;

  ^(int a) {
    return a*b;
  }
The declaration for lambda variables is almost identical to function pointers, just with a ^ instead of a *, so there's nothing to learn (or unlearn, like C++ forces you to). The ^ looks like a lambda, and historically the lambda of lambda calculus actually was a caret accent over the variable. The argument list can be elided.


Looking here (https://developer.apple.com/library/ios/documentation/Cocoa/...) for the details, I think this is going to largely be a matter of opinion. I prefer the C++ syntax, especially when it comes to capture by reference, for which the Apple syntax seems to require the __block storage type modifier.


And this is why C++ couldn't use this notation, as they didn't want to break compatibility with this extension of Apple's.


Caret as the embryonic form of lambda is apparently a myth propagated by Barendregt, and lambda is just a random Greek letter to go with alpha, beta, and eta.

http://researchblogs.cs.bham.ac.uk/thelablunch/2016/05/why-i...


I honestly would have much preferred they had a keyword so that it kinda matched the rest of the language. It feels a bit tacked on and hard to parse.

Something like:

lambda(arguments):capture list {...}

Just seems way more clear. Looks more like a function or a class (and a lambda is sort of an in-between kind of thing anyway)


Agreed. From the template library on down, it seems like the C++ community is hellbent on making the syntax for what should be clean, common operations seem like arcane Sanskrit. I don't know what their problem is.


Often, the problem is that the clean simple syntax you might want to use already means something else in C++, and the bias against breaking existing code is very strong.


Refusing to break backwards compatibility is their problem. I respect them for that; if you do want to break it make another language that plays nicely with C++ instead.


This is the first time I've looked at C++ lambdas. They appear magnificently powerful and also like another pile of easy ways to get completely screwed up.

Ah well, that's just the C++ way I suppose.

Makes me glad for Rust, that's for sure!


Here take your rustwin point!

I really like the explicit capture of C++'s lambdas more than the implicit one in most other languages (C#, Java, Python...) where you easily end up with a closure not referencing the expected variable. See: https://blogs.msdn.microsoft.com/ericlippert/2009/11/12/clos...


I was burned by the same thing in Javascript.

"Explicit is better than implicit". Therefore, I agree with you that explicit closure list, with the ability to copy and reference captured variables, is actually what C++ does right, not wrong.


The explicit capture list is only necessary in C++ because memory ownership and lifetimes are managed by the programmer in C++. Compare that to a garbage-collected language like Scheme or C#: when the implementation can figure out where memory needs to be freed and ensures you can't use-after-free, it frees the programmer from thinking about ownership (but not necessarily lifetimes: you can still wind up with memory leaks in GC'd languages if you're not careful to let go of references you no longer need). As mentioned elsewhere in this thread, Rust also offers the same level of explicit control without capture lists (though I'm on the fence about which way I prefer).

My point is that in languages with automatic memory management, explicit capture lists don't make much sense because the programmer is not tasked with managing memory and can safely capture references all the time. There's no need to ask oneself, "Do I own this pointed-to memory? Do I need to worry about it being freed before this closure? Should I make a copy?", etc. This is because, in a sense, the garbage collector itself owns the memory, but checks to make sure nothing else can use it anymore before it frees it.


You only talk about the memory management part and I guess most language designers think the same. What you and they fail to account for is that explicit capture list can reduce logical bugs.

For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur. In languages with mutability, the ability to make some part immutable is a virtue.

For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!

For three, in Javascript where closures are everywhere due to the amount of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the Javascript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what Javascript trades that for.)
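
To illustrate the first point in C++ terms, a sketch with made-up function names. The by-reference version deliberately declares `i` outside the loop so the variable is still alive when the closures run; it mimics the shared-variable behaviour described above:

```cpp
#include <functional>
#include <vector>

// Each closure gets its own copy of `i` at the moment it is created.
std::vector<int> by_copy() {
    std::vector<std::function<int()>> fs;
    for (int i = 0; i < 3; ++i) fs.push_back([i] { return i; });
    std::vector<int> out;
    for (auto& f : fs) out.push_back(f());
    return out;  // {0, 1, 2}
}

// All closures share the single variable `i`, so they all observe its
// final value once the loop has finished.
std::vector<int> by_reference() {
    std::vector<std::function<int()>> fs;
    int i = 0;
    for (; i < 3; ++i) fs.push_back([&i] { return i; });
    std::vector<int> out;
    for (auto& f : fs) out.push_back(f());
    return out;  // {3, 3, 3}
}
```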


> You only talk about the memory management part and I guess most language designers think the same. What you and they fail to account for is that explicit capture list can reduce logical bugs.

I suppose, as a language designer, I tend to think that the more I do automatically, the more I ease the programmer's burden. However, as you point out, that's not always true. That said, my point wasn't (isn't?) that explicit capture is only a good idea sans automatic memory management (it may well be -- you've certainly given me some food for thought here), but rather that it's only necessary in that case, and I think that point still stands.

> For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur.

That's a failure of language design and I don't think the proper solution is to force explicit capture on closure creation (also note that you need more than just explicit capture because to prevent such an error, you need the ability to specify that the "captured" variable ought to be copied rather than actually captured). I think the proper solution to that problem is the one that the C# team went with: limit the scope of iteration control variables to the iterated block. This is typically what programmers used to block-structured languages would expect, anyway, unless the variable were clearly declared outside the scope of the iteration.

> In languages with mutability, the ability to make some part immutable is a virtue.

That's an orthogonal issue, and can be done in many other (and more general) ways.

> For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!

You're right: that is the point of opening a new scope! That sounds like a flaw in Python's design and could be remedied by making variable definition syntax different from assignment syntax. Consider Lua with its `local` syntax, C and kin with their type annotations, the Lisps with their completely separate forms for variable definition and assignment, and so on. There's also the Tcl strategy of "it's a definition unless it was imported into this scope with `global` or `upval`; otherwise it's an assignment".

> For three, in Javascript where closures are everywhere due to the amount of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the Javascript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what Javascript trades that for.)

JavaScript is a shitty language to begin with, and fixing it wouldn't be as simple as fixing C# or Python... You make a good point here, but I still think that better tooling for data-flow analysis is a more attractive choice than a compulsory explicit capture list. On the flip side, an optional capture list could be a good compromise.


> On the flip side, an optional capture list could be a good compromise.

That is exactly what I am thinking about. Or, rather, what C++ has done right: You can let the compiler infer what to capture, like [=] or [&], or you can explicitly list the variables to capture.

> you need the ability to specify that the "captured" variable ought to be copied rather than actually captured

Yes, that is what I am talking about, and again, what C++ has done right. Most other languages give you no choice whether the capture is by copy or by reference.
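
For readers who haven't used them, the three forms side by side. This is only a sketch; `capture_forms` is a made-up name:

```cpp
#include <array>

std::array<int, 3> capture_forms() {
    int a = 1, b = 10;

    auto all_by_copy  = [=] { return a + b; };  // copies of a and b, inferred from the body
    auto all_by_ref   = [&] { ++a; ++b; };      // references to a and b, inferred from the body
    auto explicit_mix = [a, &b] { b += a; };    // spelled out: copy a, reference b

    all_by_ref();    // a == 2, b == 11
    explicit_mix();  // adds the copied a (still 1), so b == 12

    return {all_by_copy(), a, b};  // {11, 2, 12}: the copies never saw the updates
}
```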


Ah, then I see we're in agreement :)

> Most other languages give you no choice whether the capture is by copy or by reference.

That's because in languages that have traditionally had GC (i.e., languages in the Lisp tradition or in the ML tradition), the distinction didn't matter. Those languages did not "suffer" from a value/reference dichotomy (e.g., in Scheme, you're literally capturing the variable rather than a copy or reference to the value stored within -- under the hood, that variable might always store a reference for convenience, or it might store a value for performance, but it doesn't matter as it's strictly an implementation detail).

I'm glad that the C++ committee didn't just dump closures into the language without considering this sort of interaction with other aspects of the language. Without the capture lists, closures in C++ have the potential to really suck. That the explicit capture lists even exist is evidence that they've carefully considered how the new features are going to play with existing characteristics of C++. Kudos to them for that!


That is almost true, but there's one exception in those GC'ed languages due to the dichotomy of value types and reference types. The confusing behavior on capturing the iteration variable is one example.


Ah, yes! You're correct. I spend most of my GC'd time in languages that don't have such a value vs. reference dichotomy, and I'd completely forgotten about it.


It was enough of a problem in Scala to warrant this though http://docs.scala-lang.org/sips/pending/spores.html


Spores seem like an interesting solution. The language designer in me has a distaste for it, though :p

For case 1 (capture of mutable references), an explicit copy operator might be better (as in, "I want whatever value this variable is bound to, rather than the storage location") (or even vice versa, where value is the default and there's an operator for location). In a way, spores accomplish this by forcing you to do the copy manually -- but then programmers have to always remember to use the extra syntax, and they need to do it for every captured variable. I'm not quite happy with even this solution, and it may be possible to come up with something even better. Concurrency is always a can o' worms :)

For case 2 (capture of implicit "this"), I'd argue that if (a) the compiler is smart enough to know that "helper" is implicitly "this.helper" and (b) that "this" will be captured by the closure, then (c) the compiler is also smart enough to create an implicit binding for "helper" and capture that instead. This would lead to less-surprising behavior, and intentional capture of "this" could still be done via explicit access. Another option is to, rather than treating "this" as being in an enclosing scope, treat it as though it were an implicit argument to the method (albeit a covariant one). This avoids capture altogether.


Agree on the copy operator, not only for spores, have wanted it more than one time in other languages too.

Not sure how the this binding should work though. If calling a method you need to a) dispatch on the runtime type and b) provide the instance to the method when called.


The compiler would essentially emit the same code that it would in the case of the spore, but it would be automatic. You still get to dispatch on the runtime type, because the binding is created after the method invocation, but before the scope of the lambda to be closed.

I think when a programmer writes "foo.combobulate()", the vast majority of the time, they intend to capture "foo". If they didn't and were being clever, I don't think it's unreasonable for the compiler to expect them to be explicit and write "this.foo.combobulate()" instead. In the former case, the compiler creates the implicit binding to capture, in the latter it does nothing implicit and just closes over "this".

I'm certain that the compiler has enough information to do this, and that it's in accordance with the principle of least surprise ;)


What do you mean? The only time you can mess up a lambda is if a pointer that you're using gets changed. And this kind of duplicate ownership is a general problem with pointers.


I think you have issues with reference capture groups and object lifetimes. Or alternatively value capture groups and object slicing.
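
The slicing case is probably the less obvious of the two, so here is a sketch of it. `Base`, `Derived`, and the function names are hypothetical:

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Capturing a Base reference by copy stores a Base object in the closure,
// so the Derived part is sliced away and the call resolves to Base::name.
std::string name_via_copy(const Base& b) {
    auto f = [b] { return b.name(); };
    return f();
}

// Capturing by reference keeps the dynamic type, but then the closure must
// not outlive the object `b` refers to.
std::string name_via_reference(const Base& b) {
    auto f = [&b] { return b.name(); };
    return f();
}
```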


Yeah, I'm not following as well. I've used Lambdas in C++ pretty heavily and they're really well done and thought out.

FWIW nothing about them is much different from Rust's approach (anon struct + fn); it's all the ownership guarantees which give Rust its safety.
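
To show what "anon struct + fn" means on the C++ side, a sketch of a hand-written class that behaves like a simple capturing lambda. `ScaleBy` is a made-up name; the real closure type is unnamed and compiler-generated:

```cpp
// Hand-written counterpart of: [factor](int x) { return x * factor; }
class ScaleBy {
public:
    explicit ScaleBy(int factor) : factor_(factor) {}
    // const, like the call operator of a non-mutable lambda
    int operator()(int x) const { return x * factor_; }
private:
    int factor_;  // one data member per by-copy capture
};

int scale_with_lambda(int factor, int x) {
    auto f = [factor](int v) { return v * factor; };
    return f(x);
}

int scale_with_struct(int factor, int x) {
    ScaleBy f(factor);
    return f(x);
}
```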


A question for people who use C++ regularly: is C++ becoming easier to read and code?


Yes, I work in a template heavy C++ codebase and the (already quite good) situation is getting better with each language standard.

C++11 was really a turning point for the language - features like 'auto', lambdas, and variadic templates have enabled succinct generic code that is both readable and highly performant.

Increased competition between the gcc and clang teams has also been a major improvement - both have implemented many C++17 features very quickly, and error messages have greatly improved in both compilers. This is especially welcome when developing templates. Clang's licensing has made it possible to integrate libclang in to vim/emacs (ycmd, irony-mode, rtags, etc) for very accurate completion/syntax checking, etc. Clang-format has also seen quite a bit of adoption, bringing the benefits of standardized formatting to large projects.

The sanitizers have also been a huge boon - getting automatic detection of memory leaks, buffer/heap overflows, use after free, uninitialized memory reads, integer overflow, etc. is now as easy as compiling with '-fsanitize=[address|undefined|memory|etc]'.

Overall, C++(11+) is a very productive language if you have stringent performance and latency requirements and you need powerful abstraction facilities.


Absolutely, and without a doubt.

* `unique_ptr` as a local variable. Before C++11, I needed to either (a) define a holder class for anything that should be deleted at the end of a scope or (b) delete it manually and pray that there isn't an exception thrown. Now, I can just declare it, and trust the destructor to clean up after me.

* `unique_ptr` as a return value. Previously, if a function returned a pointer, there was no way of knowing who was responsible for calling `delete`. Now, I can clearly indicate intent. `unique_ptr` means that the caller now owns the object, while a C-style pointer or reference means that the callee still owns the object.

* With lambda statements, I can call `std::sort` in-place, with the sorting criteria immediately visible. Previously, I would need to define a function elsewhere in the code, obscuring what may be a simple `a.param < b.param`.

* With range-based for loops, I can loop over any container without needing the very long `std::vector<MyClassName>::iterator` declaration.

* `= delete` to remove an automatically generated method, such as copy constructors. Previously, you would declare that method to be private, then never make an implementation of it. `= delete` shows your intent much more clearly.

* `static_assert`, so that you can bail out of templates earlier, and with reasonable error messages.

* Variadic templates. These aren't needed in 99% of cases, but they are incredibly useful when designing libraries.

* `std::thread` No more messing around with different thread libraries depending on which platform you are on.
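
Several of the points above fit in one short sketch, assuming C++11. `Item`, `make_item`, `halve`, and `names_sorted_by_param` are made-up names for illustration:

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Item {
    std::string name;
    int param;
    Item(std::string n, int p) : name(std::move(n)), param(p) {}
    Item(const Item&) = delete;             // copying is now a compile error
    Item& operator=(const Item&) = delete;
    Item(Item&&) = default;                 // moving is still allowed
    Item& operator=(Item&&) = default;
};

// Returning unique_ptr signals that the caller now owns the object.
std::unique_ptr<Item> make_item(std::string name, int param) {
    return std::unique_ptr<Item>(new Item(std::move(name), param));
}

// static_assert rejects unsupported types with a readable message.
template <typename T>
T halve(T value) {
    static_assert(std::is_arithmetic<T>::value, "halve() needs a numeric type");
    return value / 2;
}

std::string names_sorted_by_param() {
    std::vector<Item> items;
    items.emplace_back("c", 3);
    items.emplace_back("a", 1);
    items.emplace_back("b", 2);

    // The ordering criterion sits right next to the call.
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.param < b.param; });

    std::string out;
    for (const Item& item : items) out += item.name;  // range-based for
    return out;
}
```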


Re unique_ptr as a local variable: in C++98 you can use auto_ptr.


True, I didn't mention it, because it has its own issues. The move-on-copy semantics of auto_ptr makes it incompatible with std containers, and makes for some rather unexpected behavior.


It does get easier to read and code over time. As an aside, the C++ in this post is pretty much as elegant as it gets.


Do you mean, becoming easier with new language features that get added?


Contrary to popular belief, new features can make a language more elegant and simple, if they make clunky old features obsolete with a simpler alternative.


Clang's C++1z support: http://clang.llvm.org/cxx_status.html (scroll down a bit)

And gcc's: https://gcc.gnu.org/projects/cxx-status.html#cxx1z

Unfortunately neither supports constexpr lambdas at the moment.


One aspect of C++ lambdas that I really don’t like is the visual confusion caused by allowing "return", since at a glance this seems to affect the parent function. I have already found myself adding comments inside lambdas like "return x; // return-from-lambda" to make sure that I see what is really happening. Python by contrast does two things better: Python makes it really hard to write long expressions as lambdas, and no "return" is used in a Python "lambda". Of course, Python also allows "def" inside a "def" as a convenient way to write longer one-time functions.

I also found that while I could use C++ lambdas for things like iteration, e.g. "object->forEachThing([](Thing const& t, bool& stop){ ... })", this makes the keyword problem worse. In this type of call, if I want to implement something that is logically like a "break" or "continue" of the loop, it has to use the "return" keyword (from the lambda only) with special conditions attached such as a "bool" variable to request the break. And that is confusing to read, even though conceptually it is similar to the Objective-C NSArray "enumerateObjectsUsingBlock:" that takes a similar approach (in that the block takes a "stop" argument).
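
A sketch of the pattern being described, to show where the two meanings of "return" end up. `Things` is a hypothetical container that holds plain ints rather than the `Thing` type from the call above:

```cpp
#include <functional>
#include <utility>
#include <vector>

class Things {
public:
    explicit Things(std::vector<int> values) : values_(std::move(values)) {}

    // Calls `visit` for each element until the callback sets `stop` to true.
    void forEachThing(const std::function<void(int, bool&)>& visit) const {
        bool stop = false;
        for (int v : values_) {
            visit(v, stop);
            if (stop) break;
        }
    }

private:
    std::vector<int> values_;
};

// Sums values until the first negative one, skipping zeros.
int sum_until_negative(const Things& things) {
    int sum = 0;
    things.forEachThing([&sum](int v, bool& stop) {
        if (v == 0) return;                  // plays the role of `continue`
        if (v < 0) { stop = true; return; }  // plays the role of `break`
        sum += v;
    });
    return sum;  // this one really does return from the enclosing function
}
```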


Not a bad article. I wish the first example wasn't so complicated. C++ lambda syntax is pretty gross. The initial breakdown is great. But why use a std::vector and std::transform in the first real example? Stick to integer addition. Keep things simple.


There is another interesting feature in C++17 for lambdas:

Possibility to cast a lambda to a function pointer.

It will become possible to store a lambda as a struct/member that could bind to "this" (like in Javascript).


I assume this is only for stateless lambdas, that have not captured anything? Because otherwise a function pointer would not be enough, right?

Currently, any stateless lambda can convert to a function pointer and be passed to any function that expects a function pointer, right?

Are you saying that C++17 has augmented this, and if so can you provide more details or a reference (or an example)? I'm quite curious to know more.
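
For context, this is the existing behaviour I mean, as a sketch that should compile as C++11. `apply_twice` stands in for any C-style callback interface and is a made-up name:

```cpp
// A plain function-pointer interface.
int apply_twice(int (*fn)(int), int x) {
    return fn(fn(x));
}

int function_pointer_demo() {
    // No captures, so the closure converts implicitly to int (*)(int).
    int (*increment)(int) = [](int v) { return v + 1; };

    // A capturing lambda carries state and does not convert:
    // int step = 2;
    // int (*bad)(int) = [step](int v) { return v + step; };  // compile error

    return apply_twice(increment, 40) +                  // 42
           apply_twice([](int v) { return v * 2; }, 1);  // 4
}
```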


https://isocpp.org/files/papers/p0018r3.html

It's the best reference I found. The paper only talks of capturing "*this" by value (as in the original post of the topic).

I think I read that in a draft about coroutines. The idea was to capture "this" by reference and convert the lambda to a function pointer to make it movable.

It would be useful for delegates or signals too.
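
As far as I understand the paper, the `*this` capture behaves like the sketch below. It assumes a C++17 compiler, and `Counter` is a made-up type:

```cpp
struct Counter {
    int value = 0;

    // [this] stores a pointer: the closure sees later changes to the object,
    // and it dangles if the object is destroyed first.
    auto by_pointer() { return [this] { return value; }; }

    // [*this] (C++17) stores a copy of the whole object inside the closure.
    auto by_copy() { return [*this] { return value; }; }
};

int star_this_demo() {
    Counter c;
    c.value = 1;
    auto live = c.by_pointer();
    auto snapshot = c.by_copy();
    c.value = 5;
    return live() * 10 + snapshot();  // live sees 5, snapshot still sees 1
}
```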


Nice. I am a bit confused -- before c++17 it is impossible to capture this by copy; is it possible to capture other objects by copy? seems like it is?


Yes, it is. Just write:

  [a](...){ ... }
and the variable "a" will be captured by copy.
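
A slightly fuller sketch of what "by copy" implies (the function name is made up). Note that the copy is read-only inside the lambda unless the lambda is marked `mutable`:

```cpp
int copy_capture_demo() {
    int a = 1;

    auto read = [a] { return a; };              // snapshot of a == 1
    auto bump = [a]() mutable { return ++a; };  // may modify its own copy only

    a = 100;              // invisible to both closures
    int first = bump();   // 2
    int second = bump();  // 3: the closure's copy persists between calls
    return read() + first + second + a;  // 1 + 2 + 3 + 100
}
```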



