1

I am working on a data aggregator implementation with nested lambda expressions. I am not very experienced with lambda functions yet and I am not 100% sure whether my implementation idea is implementable.

The Problem: I have a Multivalue class which has a vector as a private member. The data type of this private vector can be defined via a template parameter of the Multivalue class. I want to offer different aggregation functions (sum, harmonic mean, average, ...) for the data in the vector. BUT: if the vector data type is a tuple, the aggregation functions should also be available for the components of the tuple. The following link shows my first try, which is not working yet. I hope someone can explain to me what the problem is. I think there is a problem with the nested lambda expressions, I use:

http://goo.gl/w7MUAi

#include <iostream>
#include <vector>
#include <functional>
#include <tuple>

using namespace std;

template<typename ... TTypes>
std::ostream& operator<<(std::ostream& out, const std::tuple<TTypes...>& value) { return out; }

/** our general abstract aggregation object */
template<typename T, typename C>
class AbstractAggregator {
public: 
    AbstractAggregator() { }
    AbstractAggregator(const std::vector<C> *data, std::function<const T&(const C&)> access) :
    data(data), access(access) { }

    T sum() const {
        T sum;
        for (auto &i : *data)
            sum += access(i);
        return sum;
    }
protected:
    const std::vector<C> *data;
    std::function<const T&(const C&)> access;
};

/** concrete aggregation implementation for types like int, float, double .. */
template<typename T, typename C>
class Aggregator : public AbstractAggregator<T, C> {
public:
    Aggregator() { }
    Aggregator(const std::vector<C> *data, std::function<const T&(const C&)> access) : AbstractAggregator<T, C>(data, access) { }    
};

/** aggregator implementation for tuple (with subaggregators for each component */
template<typename ... TTypes, typename C>
class Aggregator<std::tuple<TTypes...>, C> : public AbstractAggregator<std::tuple<TTypes...>, C> {
public:
    Aggregator() { }
    Aggregator(const std::vector<C> *data, std::function<const std::tuple<TTypes...>&(const C&)> access) : AbstractAggregator<std::tuple<TTypes...>, C>(data, access) { 
    initSubAggregators<sizeof...(TTypes), TTypes...>(access); 
}

    std::tuple<Aggregator<TTypes, C>...> subaggregators;
private:
    template<int N>
    void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { }
    template<int N, typename THead, typename... TTail>
    void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { 
        constexpr int I = N - sizeof...(TTail) - 1;
        std::get<I>(subaggregators) = Aggregator<THead, C>(AbstractAggregator<std::tuple<TTypes...>, C>::data, [&](const C &value) { return std::get<I>(access(value)); });        
        initSubAggregators<N, TTail...>(access);
    }
};

namespace std {
    template<size_t I, typename ... TTypes, typename C>
    auto get(Aggregator<std::tuple<TTypes...>, C>& k) -> decltype(std::get<I>(k.subaggregators)) {
        return std::get<I>(k.subaggregators);
    }

    template<size_t I, typename ... TTypes, typename C>
    auto get(const Aggregator<std::tuple<TTypes...>, C>& k) -> decltype(std::get<I>(k.subaggregators)) {
        return std::get<I>(k.subaggregators);
    }
}

/** multivalue attribute implementation (inherits corresponding aggregation object) */
template<typename T>
class Multivalue : public Aggregator<T, T> {
public:
    Multivalue() : Aggregator<T, T>(&data, [](const T &value) { return value; }) { }
    void append(const T& item) { data.push_back(item); }
private:
    std::vector<T> data;
};

int main() {
    Multivalue<std::tuple<std::tuple<uint32_t, uint32_t>, float, double>> mv;
    mv.append(std::make_tuple(std::make_tuple(13, 12), 2.5, 3.5));
    mv.append(std::make_tuple(std::make_tuple(1, 7), 2.55123, 1.5));
    mv.append(std::make_tuple(std::make_tuple(5, 3), 2.312, 1.8));

    auto &a1 = std::get<2>(mv);
    auto &a2 = std::get<1>(mv);
    auto &a3 = std::get<0>(std::get<0>(mv));
    auto &a4 = std::get<1>(std::get<0>(mv));

    std::cout << "Sum 1: " << a1.sum() << std::endl;
    std::cout << "Sum 2: " << a2.sum() << std::endl;
    std::cout << "Sum 3: " << a3.sum() << std::endl;
    std::cout << "Sum 4: " << a4.sum() << std::endl;

    return 0;
}

Moo

moo
  • 486
  • 8
  • 22
  • If the code base is not huge, please post it here. If it is huge, create an [MCVE](http://stackoverflow.com/help/mcve) and post the MCVE here. – R Sahu Apr 13 '15 at 21:02
  • Have you seen the link, I have posted? I think it is already a MCVE example. – moo Apr 13 '15 at 21:04
  • No, I didn't see it. We like to see the code posted here. – R Sahu Apr 13 '15 at 21:05
  • 1
    I think re-implementing something inside std namespace is a recipe for disaster. – Ilya Kobelevskiy Apr 13 '15 at 21:14
  • Reimplement? I add get functionality for my aggregator objects. I think that's comparable to defining std::hash for your own class types, isn't it? – moo Apr 13 '15 at 21:15
  • 1
    See http://stackoverflow.com/questions/16173902/enable-stdget-support-on-class – Ilya Kobelevskiy Apr 13 '15 at 21:23
  • Okay, valid point. I will change that by creating an own getter. However, I already verified that the std::get is not the problem. The problem is somehow related to the line sum += access(i);. – moo Apr 13 '15 at 21:25
  • In `initSubAggregators`, you're capturing the function parameter **by reference**. That parameter goes out of scope, is destroyed, yet you're accessing it later. – dyp Apr 13 '15 at 22:03
  • By reference? Where? In *initSubAggregators(std::function&(const C&)> access)*, access isn't a reference, is it? – moo Apr 13 '15 at 22:06
  • No, that function parameter is *not* a reference itself. A reference would look like this: `std::function&(const&)> const& access` (with or without the `const` prior to the last `&`). Btw, I think some typedefs could simplify the code and make it easier to read. – dyp Apr 13 '15 at 22:13
  • What did you mean in your prior comment then? I will try to add some typdefs later. – moo Apr 13 '15 at 22:15
  • The function parameter is not a reference. The lambda captures it by reference. Hence, the lambda captures a "local variable" (a function parameter) by reference. This "local variable" goes out of scope at the end of the function. The lambda however lives on, and still contains a reference to the now nonexisting `access` object. When you call the lambda later, it will use this dangling reference, this leads to a segfault. – dyp Apr 13 '15 at 22:26
  • Creating the lambda the following way should fix it, shouldn't it? *[access](const C &value) { return std::get(access(value)); }* But my code still doesn't work. – moo Apr 13 '15 at 22:30
  • 1
    It should not segfault any more (at least, it doesn't on coliru). The remaining issues are 1) the one pointed out by vsoftco. 2) the identity-lambda causes UB as well: `[](const T &value) { return value; }` should be `[](const T &value) -> T const& { return value; }` It currently returns by value, and the `std::function` that stores it returns that (temporary) value by reference. – dyp Apr 13 '15 at 22:34
  • @dyp: perfect, it works. Do you want to create an answer for this? I would like to give you an upvote. – moo Apr 13 '15 at 22:42

1 Answers1

4

There are at least three issues in your code.


The first has been pointed out by vsoftco, in a now deleted answer:

In the summation function, you're reading from an uninitialized variable.

T sum() const {
    T sum; // uninitialized
    for (auto &i : *data)
    {
        std::cout << "summing from: " << i << "\n";
        sum += access(i);
    }
    return sum;
}

To be a bit more precise, the variable sum is default-initialized, but for non-class types, this implies no initialization at all.

A simple solution is to value-initialize sum:

T sum{}; // value-initialized

The second issue is a lifetime issue with a captured entity:

template<int N, typename THead, typename... TTail>
void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { 
    constexpr int I = N - sizeof...(TTail) - 1;
    std::get<I>(subaggregators) = Aggregator<THead, C>(AbstractAggregator<std::tuple<TTypes...>, C>::data, [&](const C &value) { return std::get<I>(access(value)); });        
    initSubAggregators<N, TTail...>(access);
}

Since I find this rather hard to read, let's introduce some typedefs and other simplifications:

using R = std::tuple<TTypes...> const&;
using P = C const&;
using F = std::function<R(P)>;

template<int N, typename THead, typename... TTail>
void initSubAggregators(F access) {
    constexpr int I = N - sizeof...(TTail) - 1;
    auto l = [&](C const& value) { return std::get<I>(access(value)); };
    std::get<I>(subaggregators) = Aggregator<THead, C>(this->data, l);
    initSubAggregators<N, TTail...>(access);
}

The lamdba in the function initSubAggregators captures the function parameter access by reference. The function parameter is an object that goes out of scope at the end of the function, and is subsequently destroyed. The lambda however is stored within a std::function inside the sub-aggregator and lives beyond the scope of initSubAggregators. Since it captured access by reference, it will store a dangling reference after initSubAggregators returns. One possible solution is to store access by value (which creates a copy for each sub-aggregator). Another solution is to store access within the tuple-aggregator as a data member, and store a reference to that data member within each lambda.


The third issue is also related to lifetime of an object.

The lambda that is created in the constructor of Multivalue and passed to the aggregator returns by value due to the normal return type deduction:

Multivalue() : Aggregator<T, T>(&data, [](const T &value) { return value; }) { }

The constructor of Aggregator<T, T> however expects a function object that returns by const-reference:

Aggregator(const std::vector<C> *data, std::function<const T&(const C&)> access)

When the std::function object is called, it calls the lambda object. The lambda returns a T by value. std::function then returns that temporary object (the return value) by reference, creating a dangling reference. This is the same issue as in What is the return type of a lambda expression if an item of a vector is returned? It can be solved by manually defining the return type:

Multivalue() : Aggregator<T, T>(&data, [](const T &value) -> const T & { return value; }) { }
Community
  • 1
  • 1
dyp
  • 38,334
  • 13
  • 112
  • 177
  • 1
    There are probably better solutions to these problems, but I'm not sure if they're worth it. I'd consider redesigning the classes instead. – dyp Apr 13 '15 at 23:39
  • `std::function< T&(???) >` should really be spewing errors if you pass it a function that returns by value, as it is returning a reference to a temporary. Wonder why not? Maybe it is stored in an intermediate reference parameter before returning, and that confuses the compiler? – Yakk - Adam Nevraumont Apr 14 '15 at 14:18
  • @Yakk I agree. I think the reason why it doesn't emit a warning is that the header containing the source of the lifetime issue is defined as GCC system_header via a pragma. In libc++, header __functional_base, if I copy the relevant `__invoke_void_return_wrapper` to a custom header, I get the warning. Similarly, if I deactivate the pragma via `_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER`. [Jonathan Wakely's comment here](http://stackoverflow.com/questions/29610318/what-is-the-return-type-of-a-lambda-expression-if-an-item-of-a-vector-is-returne#comment47367262_29610318) is also relevant. – dyp Apr 14 '15 at 15:04
  • Oh, that is a bad policy, to disable warnings in system headers. I guess the inability for end users to modify the code to say "yes, I understand, I can deal with this error" is missing. But with template code, when it interacts with a user-type, that lack of warning is ... unfortunate. – Yakk - Adam Nevraumont Apr 14 '15 at 15:58