In C++11 lambda syntax, heap-allocated closures?

Question

C++11 lambdas are great!

But one thing is missing, which is how to safely deal with mutable data.

The following will give bad counts after the first count:

#include <cstdio>
#include <functional>
#include <memory>

std::function<int(void)> f1()
{
    int k = 121;
    return std::function<int(void)>([&]{return k++;});
}

int main()
{
    int j = 50;
    auto g = f1();
    printf("%d\n", g());
    printf("%d\n", g());
    printf("%d\n", g());
    printf("%d\n", g());
}

gives,

$ g++-4.5 -std=c++0x -o test test.cpp && ./test
121
8365280
8365280
8365280

The reason is that after f1() returns, k is out of scope but still on the stack. So the first time g() is executed k is fine, but after that the stack is corrupted and k loses its value.

So, the only way I've managed to make safely returnable closures in C++11 is to allocate closed variables explicitly on the heap:

std::function<int(void)> f2()
{
    int k = 121;
    std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
    return std::function<int(void)>([=]{return (*o)++;});
}

int main()
{
    int j = 50;
    auto g = f2();
    printf("%d\n", g());
    printf("%d\n", g());
    printf("%d\n", g());
    printf("%d\n", g());
}

Here, [=] is used to ensure the shared pointer is copied, not referenced, so that memory handling is done correctly: the heap-allocated copy of k should be freed when the generated function g goes out of scope. The result is as desired,

$ g++-4.5 -std=c++0x -o test test.cpp && ./test
121
122
123
124

It's pretty ugly to refer to variables by dereferencing them, but it should be possible to use references instead:

std::function<int(void)> f3()
{
    int k = 121;
    std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
    int &p = *o;
    return std::function<int(void)>([&]{return p++;});
}

Actually, this oddly gives me,

$ g++-4.5 -std=c++0x -o test test.cpp && ./test
0
1
2
3

Any idea why? Maybe it's not polite to take a reference of a shared pointer, now that I think about it, since it's not a tracked reference. I found that moving the reference to inside the lambda causes a crash,

std::function<int(void)> f4()
{
    int k = 121;
    std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
    return std::function<int(void)>([&]{int &p = *o; return p++;});
}

giving,

$ g++-4.5 -std=c++0x -o test test.cpp && ./test
156565552
/bin/bash: line 1: 25219 Segmentation fault      ./test

In any case, it would be nice if there was a way to automatically make safely returnable closures via heap allocation. For example, if there was an alternative to [=] and [&] that indicated that variables should be heap allocated and referenced via references to shared pointers. My initial thought when I learned about std::function was that it creates an object encapsulating the closure, therefore it could provide storage for the closure environment, but my experiments show that this doesn't seem to help.

I think safely returnable closures in C++11 are going to be paramount to using them, does anyone know how this can be accomplished more elegantly?

You should run tests like this inside Valgrind, because you're seeing by-chance correct behavior when accessing deallocated memory. — Potatoswatter, May 20 '12 at 01:49

bames53 · Accepted Answer · 2012-05-21T14:31:47.567

25

In f1 you're getting undefined behavior for the reason you say; the lambda contains a reference to a local variable, and after the function returns the reference is no longer valid. To get around this you don't have to allocate on the heap, you simply have to declare that captured values are mutable:

int k = 121;
return std::function<int(void)>([=]() mutable {return k++;});

You will have to be careful about using this lambda though, because different copies of it will be modifying their own copy of the captured variable. Often algorithms expect that using a copy of a functor is equivalent to using the original. I think there's only one algorithm that actually makes allowances for a stateful function object, std::for_each, where it returns another copy of the function object it uses so you can access whatever modifications occurred.

In f3 nothing is maintaining a copy of the shared pointer, so the memory is being freed and accessing it gives undefined behavior. You can fix this by explicitly capturing the shared pointer by value while still capturing the pointed-to int by reference.

std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
int &p = *o;
return std::function<int(void)>([&p,o]{return p++;});

f4 is again undefined behavior because you're again capturing a reference to a local variable, o. You should simply capture by value but then still create your int &p inside the lambda in order to get the syntax you want.

std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
return std::function<int(void)>([o]() -> int {int &p = *o; return p++;});

Note that once the lambda body contains more than a single return statement, C++11 no longer allows you to omit the return type. (clang, and I assume gcc, have an extension that allows return type deduction even with multiple statements, but you should get a warning at least.)

edited May 21 '12 at 14:31

answered May 20 '12 at 01:33

bames53

Note that it's harder to access state in a lambda than an old-style functor. The only thing you can do is call it again. But stateful lambdas are still useful even when they don't return auxiliary results. – Potatoswatter May 20 '12 at 01:57
"*I think there's only one algorithm that actually makes allowances for a stateful lambda, std::accumulate, where it returns the lambda it uses so you can access whatever modifications occurred.*" That's `std::for_each` you're thinking of. – ildjarn May 21 '12 at 14:24
If different instances of a closure in the same scope get their own copy of the variable, that variable is not captured!!! – Kaz Feb 23 '13 at 01:32
@Kaz IIUC, you are saying that `[&]` lambdas are proper closures, while `[=]` are not. Why? Is it because other languages usually define closures that reference (rather than copy) the outer environment? – max Aug 27 '17 at 20:13
`std::function([&p,o]{return p++;});` .. why is it okay for the reference to go out of scope in this case? – user3882729 Apr 10 '22 at 02:56
@user3882729 The name `p` (note I'm not calling it a variable or an object; it can be tricky to talk about references in C++) goes out of scope, but the object referenced by that name remains valid because that object's lifetime will continue as long as any _shared_ptr_ keeps it alive (such as the copy of `o` that the lambda also captures). The lambda _captures-by-reference_ the object that `p` refers to, but doesn't care about the _name_ `p` itself. The line you quote is effectively the same as this use of a C++14 capture initializer: `std::function([&p = *o, o]{return p++;});`. – bames53 Apr 11 '22 at 04:12

score 2 · Answer 2 · answered Jun 03 '13 at 06:13

Here's my testing code. It uses a recursive function to compare the address of the lambda's captured variable with the addresses of stack-based variables.

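#include <stdio.h>
#include <functional>

// Invokes the callback; passing a lambda here wraps it in a std::function.
void fun2(std::function<void()> callback) {
    callback();
}

void fun1(int n) {
    if (n <= 0) return;
    printf("stack address = %p, ", (void*)&n);   // %p expects a void*

    fun2([n]() {
        // Address of the copy of n stored in the closure held by callback.
        printf("capture address = %p\n", (void*)&n);
        fun1(n - 1);
    });
}

int main() {
    fun1(200);
    return 0;
}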

Compiled with mingw64 and run on Win7, it outputs:

stack address = 000000000022F1E0, capture address = 00000000002F6D20
stack address = 000000000022F0C0, capture address = 00000000002F6D40
stack address = 000000000022EFA0, capture address = 00000000002F6D60
stack address = 000000000022EE80, capture address = 00000000002F6D80
stack address = 000000000022ED60, capture address = 00000000002F6DA0
stack address = 000000000022EC40, capture address = 00000000002F6DC0
stack address = 000000000022EB20, capture address = 00000000007A7810
stack address = 000000000022EA00, capture address = 00000000007A7820
stack address = 000000000022E8E0, capture address = 00000000007A7830
stack address = 000000000022E7C0, capture address = 00000000007A7840

Clearly the captured variable is not located on the stack, and the addresses of the captured copies are not contiguous.

So I believe that some compilers may use dynamic memory allocation to capture lambda variables.

Lambda objects are simply syntactic sugar for creating normal function objects. Captured variables become members of the function object. Dynamic memory allocation is not used unless you specifically write the code to do that. — bames53, Jun 03 '13 at 14:17
@bames53 according to this article: http://www.drdobbs.com/cpp/efficient-use-of-lambda-expressions-and/232500059?pgno=2, std::function implementation is actually allowed to do dynamic allocations. — marcinj, Nov 28 '14 at 19:44
@brightstar I'm only talking about the compiler's implementation of lambda objects. The above answer misattributes dynamic allocations in the `std::function<>` implementation to the compiler's implementation of the lambda object. — bames53, Nov 28 '14 at 20:38
Thank you, bames! I finally realized that it's std::function<> that allocates the memory dynamically, not the lambda. — xhawk18, Mar 02 '15 at 07:10
