3

I'm trying to create a C++11 implementation of Don Clugston's Member Function Pointers and the Fastest Possible C++ Delegates, and make it work as a drop-in std::function replacement.

This is what I got so far.

I construct lambda FastDelegates like this:

// FastFunc is my name for FastDelegate
template<typename LambdaType> FastFunc(LambdaType lambdaExpression)
{
    this->m_Closure.bindmemfunc(&lambdaExpression, &LambdaType::operator());
}

Now, some tests:

FastFunc<void()> test = []{ std::cout << "hello" << std::endl; };
test();
// Correctly prints "hello"

bool b{false};
FastFunc<void()> test2 = [&b]{ std::cout << b << std::endl; };
test2();
// Crash!

As you can see, when the lambda is "trivial" (no captures), copying it by value and taking its address works. But when the lambda stores some kind of state (captures), I cannot just copy it by value into the FastFunc.

I tried getting the lambda by reference, but I cannot do that when it's a temporary like in the example.

I have to somehow store the lambda inside the FastFunc, but I don't want to use std::shared_ptr because it's slow (I tried a different fastdelegate implementation that used it, and its performance was comparable to std::function).

How can I make my implementation of Don Clugston's fastest possible C++ delegates work with lambdas that capture state, preserving the amazing performance of fastdelegates?

Vittorio Romeo
  • It's because you cannot convert a capturing lambda into a function pointer: you need type erasure and `std::shared_ptr` provides it. – user1095108 Sep 05 '13 at 10:22
  • @user1095108: if you copy a `std::function` encapsulating a mutable lambda (aka, lambda with mutable state) do the two `std::function` instances share the same state, or does each one have its own copy ? – Matthieu M. Sep 05 '13 at 14:58
  • @MatthieuM. each one has its own copy AFAIK. – user1095108 Sep 05 '13 at 15:23
  • @user1095108: okay, then it probably does not use `std::shared_ptr` because the type erasure would prevent copying the inner lambda (no `clone` method there). – Matthieu M. Sep 05 '13 at 15:48
  • @MatthieuM. No, it uses either placement `new` or copies into a `new`ly allocated block. Good point about mutables hehehe. – user1095108 Sep 05 '13 at 17:16
  • @MatthieuM. This is why, btw, `std::function` is often a disaster if you use it with `<algorithm>` algorithms. – user1095108 Sep 05 '13 at 17:30
  • @user1095108: Yes, it is unfortunate that the algorithms were specified as taking *both* iterators and predicates by value. It means that any time your iterator or predicate is fat, you risk performance issues :( – Matthieu M. Sep 06 '13 at 06:07
  • A bit late to the party, but am I the only one who noticed that the code above is taking the address of lambdaExpression, which is destroyed when the function finishes, thus creating a dangling pointer which is probably responsible for the crash? Maybe lambdaExpression should be taken by const reference? – SomeProgrammer Feb 10 '21 at 09:10

4 Answers

8

You have diagnosed the situation well: you need to store the state.

Since the lambda is a temporary object, you are normally allowed to move from it, which should be preferred to a copy where possible (move is more general than copy: it also covers types that cannot be copied).

Now, all you need to do is reserve some storage for it; if this requires a dynamic allocation, you might indeed get a performance degradation. On the other hand, an object needs to have a fixed footprint, so what can you do?

One possible solution is to offer a configurable (but limited) storage capacity:

static size_t const Size = 32;
static size_t const Alignment = alignof(std::max_align_t);

typedef std::aligned_storage<Size, Alignment>::type Storage;
Storage storage;

Now you can (using reinterpret_cast as necessary) store your lambda within storage, provided its size fits (which can be checked using static_assert).
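As a minimal sketch of that idea, the snippet below move-constructs a callable into the fixed buffer with placement new and rejects anything too large at compile time. The helper names `emplace`, `destroy_in_place` and `store_and_call` are made up for illustration and are not part of any library; `Size` simply mirrors the 32-byte budget above.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

static std::size_t const Size = 32;
typedef std::aligned_storage<Size, alignof(std::max_align_t)>::type Storage;

// Hypothetical helper: move-construct f inside storage.
// Fails to compile if the callable does not fit in the buffer.
template <typename F>
F* emplace(Storage& storage, F f) {
    static_assert(sizeof(F) <= Size, "callable too large for the buffer");
    static_assert(alignof(F) <= alignof(Storage), "callable is over-aligned");
    return new (&storage) F(std::move(f));
}

// Hypothetical helper: run the destructor by hand; the buffer is not freed.
template <typename F>
void destroy_in_place(F* f) {
    f->~F();
}

// Stores a capturing lambda in the buffer, calls it, then cleans up.
int store_and_call(int base) {
    Storage storage;
    auto lambda = [base](int x) { return base + x; };
    auto* stored = emplace(storage, lambda);
    int result = (*stored)(2);
    destroy_in_place(stored);
    return result;
}
```

The lambda's state now lives inside `storage`, so it stays valid for as long as the object that owns the buffer does, with no heap allocation involved.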

I finally managed to get a working example (I had to restart from scratch because that fast delegate code is so verbose!); you can see it in action here (and the code is below).

I have only scratched the surface, notably because it lacks copy and move operators. To add them properly, those operations need to be added to the handler, following the same pattern as the two existing operations.

Code:

#include <cstddef>

#include <iostream>
#include <memory>
#include <new>          // placement new
#include <type_traits>
#include <utility>      // std::move, std::forward

template <typename, size_t> class FastFunc;

template <typename R, typename... Args, size_t Size>
class FastFunc<R(Args...), Size> {
public:
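    template <typename F>
    FastFunc(F f): handler(&Get<F>()) {
        // Reject callables that do not fit in the inline storage.
        static_assert(sizeof(F) <= Size, "callable too large for storage");
        new (&storage) F(std::move(f));
    }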

    ~FastFunc() {
        handler->destroy(&storage);
    }

    R operator()(Args&&... args) {
      return handler->apply(&storage, std::forward<Args>(args)...);
    }

private:
    using Storage = typename std::aligned_storage<Size, alignof(std::max_align_t)>::type;

    struct Handler {
        R (*apply)(void*, Args&&...);
        void (*destroy)(void*);
    }; // struct Handler

    template <typename F>
    static R Apply(void* f, Args&&... args) {
        // Return the result; falling off the end is undefined for non-void R.
        return (*reinterpret_cast<F*>(f))(std::forward<Args>(args)...);
    }

    template <typename F>
    static void Destroy(void* f) {
        reinterpret_cast<F*>(f)->~F();
    }

    template <typename F>
    static Handler const& Get() {
        static Handler const H = { &Apply<F>, &Destroy<F> };
        return H;
    } // Get

    Handler const* handler;
    Storage storage;
}; // class FastFunc

int main() {
    FastFunc<void(), 32> stateless = []() { std::cout << "stateless\n"; };
    stateless();

    bool b = true;
    FastFunc<void(), 32> stateful = [&b]() { std::cout << "stateful: " << b << "\n"; };
    stateful();

    b = false;
    stateful();

    return 0;
}
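The copy and move operations mentioned above could be added to the handler along the following lines. This is only a sketch under assumptions, not a tuned implementation: the class name `SmallFunc`, the `copy`/`move` handler slots and `copy_demo` are invented for illustration, `operator()` takes its arguments by value, and assignment is left out.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

template <typename, std::size_t> class SmallFunc;  // hypothetical name

template <typename R, typename... Args, std::size_t Size>
class SmallFunc<R(Args...), Size> {
public:
    template <typename F>
    SmallFunc(F f) : handler(&Get<F>()) {
        static_assert(sizeof(F) <= Size, "callable too large for storage");
        new (&storage) F(std::move(f));
    }
    // Copy and move dispatch through the handler, like apply and destroy.
    SmallFunc(SmallFunc const& other) : handler(other.handler) {
        handler->copy(&storage, &other.storage);
    }
    SmallFunc(SmallFunc&& other) : handler(other.handler) {
        handler->move(&storage, &other.storage);
    }
    SmallFunc& operator=(SmallFunc const&) = delete;  // omitted in this sketch
    ~SmallFunc() { handler->destroy(&storage); }

    R operator()(Args... args) {
        return handler->apply(&storage, std::forward<Args>(args)...);
    }

private:
    using Storage =
        typename std::aligned_storage<Size, alignof(std::max_align_t)>::type;

    struct Handler {
        R (*apply)(void*, Args&&...);
        void (*copy)(void* dst, void const* src);
        void (*move)(void* dst, void* src);
        void (*destroy)(void*);
    };

    template <typename F> static R Apply(void* f, Args&&... args) {
        return (*static_cast<F*>(f))(std::forward<Args>(args)...);
    }
    template <typename F> static void Copy(void* dst, void const* src) {
        new (dst) F(*static_cast<F const*>(src));
    }
    template <typename F> static void Move(void* dst, void* src) {
        new (dst) F(std::move(*static_cast<F*>(src)));
    }
    template <typename F> static void Destroy(void* f) {
        static_cast<F*>(f)->~F();
    }
    template <typename F> static Handler const& Get() {
        static Handler const H = { &Apply<F>, &Copy<F>, &Move<F>, &Destroy<F> };
        return H;
    }

    Handler const* handler;
    Storage storage;
};

// Each copy owns its own state: advancing the copy leaves the original alone.
int copy_demo() {
    int n = 0;
    SmallFunc<int(), 32> counter = [n]() mutable { return ++n; };
    counter();                           // original counts to 1
    SmallFunc<int(), 32> copy(counter);  // copy starts from 1
    copy();                              // copy counts to 2
    copy();                              // copy counts to 3
    int a = counter();                   // original counts to 2
    int b = copy();                      // copy counts to 4
    return a * 10 + b;
}
```

Because the copy is constructed through the type-erased `copy` slot, a mutable lambda's state is duplicated rather than shared, which matches the behavior discussed for `std::function` in the comments.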
Matthieu M.
  • `std::aligned_storage` sounds great! However, I'm having trouble moving the lambda *inside* the storage. Can you show a simple example? The ones on cppreference didn't help. – Vittorio Romeo Sep 05 '13 at 11:32
  • @VittorioRomeo Just copy or move it in there, e.g. `new (&storage) LambdaType(lambdaExpression);`. But it ain't gonna be faster, as `storage` location now depends on the `this` pointer, which means inlining the invocation is now impossible – user1095108 Sep 05 '13 at 11:36
  • @VittorioRomeo: I was working on it, but that fast delegate god was so awful that I went from scratch to something a tad simpler. It might not be fast any longer mind (no performance guarantee is implied!), however it does showcase how to copy/move state. – Matthieu M. Sep 05 '13 at 11:54
  • Doesn't this amount to reimplementing `std::function` (and virtual functions too!)? – R. Martinho Fernandes Sep 05 '13 at 12:07
  • @R.MartinhoFernandes: Well... that's what fast delegate is all about (though it was invented at a time when `std::function` did not exist). There are small differences though: 1/ `std::function` uses heap allocation (in general) so this is potentially faster, 2/ it (potentially) uses less space than virtual functions would, as for virtual functions it seems you would need to store both the virtual pointer (in storage) and a pointer to the base class (outside storage). – Matthieu M. Sep 05 '13 at 12:44
  • @Matthieu: Your version only avoids heap allocation by being less generic- for example, the size limitation. Most implementations of std::function contain a small object buffer which they can use if you are using a small functor. – Puppy Sep 05 '13 at 12:46
  • @MatthieuM. `std::function` implementations do small buffer optimisation, which is just like this, except with a fallback to support all scenarios (the heap). (Why wouldn't they? If there's an obvious optimisation that works, why would the stdlib skip it?) – R. Martinho Fernandes Sep 05 '13 at 12:52
  • [Some naive benchmarking.](http://pastebin.com/FRCJuUYi) `raw func` and `raw memfunc` are just calling the functions normally. `don_delegate` is my implementation of Don's FastDelegate, with `shared_ptr` instead of `aligned_storage`, because of the reasons explained by @user1095108. `fastfunc2` is @Matthieu M.'s above implementation. `fastfunc2` is @Matthieu M.'s above implementation using a `shared_ptr` instead of `aligned_storage`. [Benchmark code](http://pastebin.com/Ncpxgm9v) – Vittorio Romeo Sep 05 '13 at 12:57
  • Your numbers are irrelevant, since they compare apples and oranges. Neither fastdelegates nor your efforts nor Matthieu's are as generic as std::function. – Puppy Sep 05 '13 at 13:01
  • @DeadMG: what can `std::function` do that my implementation of Don's fastdelegate cannot? – Vittorio Romeo Sep 05 '13 at 13:02
  • @VittorioRomeo: Er, how about "Store capturing lambdas"- the very reason you're asking the question in the first place? – Puppy Sep 05 '13 at 13:04
  • @DeadMG: the benchmarks I posted show "stored capturing lambdas". I used `std::aligned_storage` to do that at first, as this answer suggested, then tried `std::shared_ptr`, which is faster. Is there anything left to implement that `std::function` can do? – Vittorio Romeo Sep 05 '13 at 13:06
  • @VittorioRomeo There's one thing you can still do. Explore using stack allocators and allocators allocating from static arrays. That will beat `std::shared_ptr` for sure. `Boost.Pool` comes to mind. – user1095108 Sep 05 '13 at 13:27
  • @MatthieuM. : your FastFunc implementation is as fast as Don's FastDelegate (but a lot cleaner) in all cases except one: global functions. I've spent the last hour trying to somehow bind global functions in it but with no success. Any idea how I can handle global functions in a separate way so that calling them via your `FastFunc` is as fast as calling them normally? – Vittorio Romeo Sep 05 '13 at 14:14
  • @VittorioRomeo: I was looking at your benchmark code and note that the loop bodies do not have any side-effects... thus potentially being completely optimized away (which would explain the `0 ms` cases). As for beating Don's FastDelegate on global functions... no idea how he does it (I have not looked into its code too deeply). My trampoline (the `apply` function) on top of aliasing might be getting in the way of optimization. I am afraid this would need to be investigated in-depth (looking at the compiler IR to see where the optimizer chokes) and I don't have the time right now :x – Matthieu M. Sep 05 '13 at 14:55
  • @MatthieuM. tried adding simple side effects (and making sure to track them). It's still `0 ms`. My guess is that those *raw* functions are being completely inlined by the compiler. – Vittorio Romeo Sep 05 '13 at 19:29
  • @VittorioRomeo: That would be my guess too, I suppose the magic going on in my quick and dirty implementation is too clever for compilers to see through :( – Matthieu M. Sep 06 '13 at 06:08
1

You can't.

Here's the thing. FastDelegate only works in a few, very specific circumstances. That's what makes it faster. You won't beat your Standard library implementer at implementing std::function.

Puppy
1

I've made a solution that fits a lambda function into a FastDelegate as a pointer only (it does not store anything else), using hard labor and a couple of other threads such as: Get lambda parameter type

here it is:

namespace details{

    template<class FPtr> struct function_traits;
    template<class RT, class CT                                                                                        >struct function_traits<RT (CT::*)(                                      )     >{ typedef RT Result;                                                                                                                                                                  typedef RT (CT::*Signature)(                                      );};
    template<class RT, class CT                                                                                        >struct function_traits<RT (CT::*)(                                      )const>{ typedef RT Result;                                                                                                                                                                  typedef RT (CT::*Signature)(                                      );};
    template<class RT                                                                                                  >struct function_traits<RT        (                                      )     >{ typedef RT Result;                                                                                                                                                                  typedef RT       Signature (                                      );};
    template<class RT, class CT, class P1T                                                                             >struct function_traits<RT (CT::*)(P1T                                   )     >{ typedef RT Result;  typedef P1T Param1;                                                                                                                                             typedef RT (CT::*Signature)(P1T                                   );};
    template<class RT, class CT, class P1T                                                                             >struct function_traits<RT (CT::*)(P1T                                   )const>{ typedef RT Result;  typedef P1T Param1;                                                                                                                                             typedef RT (CT::*Signature)(P1T                                   );};
    template<class RT          , class P1T                                                                             >struct function_traits<RT        (P1T                                   )     >{ typedef RT Result;  typedef P1T Param1;                                                                                                                                             typedef RT       Signature (P1T                                   );};
    template<class RT, class CT, class P1T, class P2T                                                                  >struct function_traits<RT (CT::*)(P1T, P2T                              )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2;                                                                                                                         typedef RT (CT::*Signature)(P1T, P2T                              );};
    template<class RT, class CT, class P1T, class P2T                                                                  >struct function_traits<RT (CT::*)(P1T, P2T                              )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2;                                                                                                                         typedef RT (CT::*Signature)(P1T, P2T                              );};
    template<class RT          , class P1T, class P2T                                                                  >struct function_traits<RT        (P1T, P2T                              )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2;                                                                                                                         typedef RT       Signature (P1T, P2T                              );};
    template<class RT, class CT, class P1T, class P2T, class P3T                                                       >struct function_traits<RT (CT::*)(P1T, P2T, P3T                         )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3;                                                                                                     typedef RT (CT::*Signature)(P1T, P2T, P3T                         );};
    template<class RT, class CT, class P1T, class P2T, class P3T                                                       >struct function_traits<RT (CT::*)(P1T, P2T, P3T                         )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3;                                                                                                     typedef RT (CT::*Signature)(P1T, P2T, P3T                         );};
    template<class RT          , class P1T, class P2T, class P3T                                                       >struct function_traits<RT        (P1T, P2T, P3T                         )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3;                                                                                                     typedef RT       Signature (P1T, P2T, P3T                         );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T                                            >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T                    )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4;                                                                                 typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T                    );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T                                            >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T                    )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4;                                                                                 typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T                    );};
    template<class RT          , class P1T, class P2T, class P3T, class P4T                                            >struct function_traits<RT        (P1T, P2T, P3T, P4T                    )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4;                                                                                 typedef RT       Signature (P1T, P2T, P3T, P4T                    );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T                                 >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T               )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5;                                                             typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T               );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T                                 >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T               )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5;                                                             typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T               );};
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T                                 >struct function_traits<RT        (P1T, P2T, P3T, P4T, P5T               )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5;                                                             typedef RT       Signature (P1T, P2T, P3T, P4T, P5T               );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T          )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6;                                         typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T          );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T          )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6;                                         typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T          );};
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      >struct function_traits<RT        (P1T, P2T, P3T, P4T, P5T, P6T          )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6;                                         typedef RT       Signature (P1T, P2T, P3T, P4T, P5T, P6T          );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7;                     typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           >struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     )const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7;                     typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     );};
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           >struct function_traits<RT        (P1T, P2T, P3T, P4T, P5T, P6T, P7T     )     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7;                     typedef RT       Signature (P1T, P2T, P3T, P4T, P5T, P6T, P7T     );};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T>struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7; typedef P8T Param8; typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T);};
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T>struct function_traits<RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)const>{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7; typedef P8T Param8; typedef RT (CT::*Signature)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T);};
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T>struct function_traits<RT        (P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)     >{ typedef RT Result;  typedef P1T Param1; typedef P2T Param2; typedef P3T Param3; typedef P4T Param4; typedef P5T Param5; typedef P6T Param6; typedef P7T Param7; typedef P8T Param8; typedef RT       Signature (P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T);};

    template<class T>
    typename function_traits<T>::Signature* bar_helper(T);

    template<class F>
    class FuncTraitsOf{
    public:
        typedef decltype(bar_helper(&F::operator())) fptr;
        typedef typename std::remove_pointer<fptr>::type Signature;     //Signature =   bool __cdecl(int,float)
        typedef typename function_traits< Signature > R;                //R         =   struct function_traits<bool __cdecl(int,float)>
    };

    template< class FuncTraits>class FDSel;
    template<class RT, class CT                                                                                        > struct FDSel< function_traits< RT (CT::*)(                                      )      > >{ typedef fastdelegate::FastDelegate0<                                       RT> R; };
    template<class RT, class CT                                                                                        > struct FDSel< function_traits< RT (CT::*)(                                      )const > >{ typedef fastdelegate::FastDelegate0<                                       RT> R; };
    template<class RT                                                                                                  > struct FDSel< function_traits< RT        (                                      )      > >{ typedef fastdelegate::FastDelegate0<                                       RT> R; };
    template<class RT, class CT, class P1T                                                                             > struct FDSel< function_traits< RT (CT::*)(P1T                                   )      > >{ typedef fastdelegate::FastDelegate1<P1T                                   ,RT> R; };
    template<class RT, class CT, class P1T                                                                             > struct FDSel< function_traits< RT (CT::*)(P1T                                   )const > >{ typedef fastdelegate::FastDelegate1<P1T                                   ,RT> R; };
    template<class RT          , class P1T                                                                             > struct FDSel< function_traits< RT        (P1T                                   )      > >{ typedef fastdelegate::FastDelegate1<P1T                                   ,RT> R; };
    template<class RT, class CT, class P1T, class P2T                                                                  > struct FDSel< function_traits< RT (CT::*)(P1T, P2T                              )      > >{ typedef fastdelegate::FastDelegate2<P1T, P2T                              ,RT> R; };
    template<class RT, class CT, class P1T, class P2T                                                                  > struct FDSel< function_traits< RT (CT::*)(P1T, P2T                              )const > >{ typedef fastdelegate::FastDelegate2<P1T, P2T                              ,RT> R; };
    template<class RT          , class P1T, class P2T                                                                  > struct FDSel< function_traits< RT        (P1T, P2T                              )      > >{ typedef fastdelegate::FastDelegate2<P1T, P2T                              ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T                                                       > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T                         )      > >{ typedef fastdelegate::FastDelegate3<P1T, P2T, P3T                         ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T                                                       > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T                         )const > >{ typedef fastdelegate::FastDelegate3<P1T, P2T, P3T                         ,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T                                                       > struct FDSel< function_traits< RT        (P1T, P2T, P3T                         )      > >{ typedef fastdelegate::FastDelegate3<P1T, P2T, P3T                         ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T                                            > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T                    )      > >{ typedef fastdelegate::FastDelegate4<P1T, P2T, P3T, P4T                    ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T                                            > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T                    )const > >{ typedef fastdelegate::FastDelegate4<P1T, P2T, P3T, P4T                    ,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T, class P4T                                            > struct FDSel< function_traits< RT        (P1T, P2T, P3T, P4T                    )      > >{ typedef fastdelegate::FastDelegate4<P1T, P2T, P3T, P4T                    ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T                                 > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T               )      > >{ typedef fastdelegate::FastDelegate5<P1T, P2T, P3T, P4T, P5T               ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T                                 > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T               )const > >{ typedef fastdelegate::FastDelegate5<P1T, P2T, P3T, P4T, P5T               ,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T                                 > struct FDSel< function_traits< RT        (P1T, P2T, P3T, P4T, P5T               )      > >{ typedef fastdelegate::FastDelegate5<P1T, P2T, P3T, P4T, P5T               ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T          )      > >{ typedef fastdelegate::FastDelegate6<P1T, P2T, P3T, P4T, P5T, P6T          ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T          )const > >{ typedef fastdelegate::FastDelegate6<P1T, P2T, P3T, P4T, P5T, P6T          ,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T                      > struct FDSel< function_traits< RT        (P1T, P2T, P3T, P4T, P5T, P6T          )      > >{ typedef fastdelegate::FastDelegate6<P1T, P2T, P3T, P4T, P5T, P6T          ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     )      > >{ typedef fastdelegate::FastDelegate7<P1T, P2T, P3T, P4T, P5T, P6T, P7T     ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           > struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T     )const > >{ typedef fastdelegate::FastDelegate7<P1T, P2T, P3T, P4T, P5T, P6T, P7T     ,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T           > struct FDSel< function_traits< RT        (P1T, P2T, P3T, P4T, P5T, P6T, P7T     )      > >{ typedef fastdelegate::FastDelegate7<P1T, P2T, P3T, P4T, P5T, P6T, P7T     ,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T> struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)      > >{ typedef fastdelegate::FastDelegate8<P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T,RT> R; };
    template<class RT, class CT, class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T> struct FDSel< function_traits< RT (CT::*)(P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)const > >{ typedef fastdelegate::FastDelegate8<P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T,RT> R; };
    template<class RT          , class P1T, class P2T, class P3T, class P4T, class P5T, class P6T, class P7T, class P8T> struct FDSel< function_traits< RT        (P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T)      > >{ typedef fastdelegate::FastDelegate8<P1T, P2T, P3T, P4T, P5T, P6T, P7T, P8T,RT> R; };


}


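// Builds a FastDelegate from a functor or lambda with a single operator().
// f is taken by non-const reference because the delegate stores only a
// pointer to f, not a copy: f must outlive the returned delegate.
template<class F>
typename details::FDSel< typename details::FuncTraitsOf<F>::R >::R MakeDelegate(F& f){
    return fastdelegate::MakeDelegate(&f, &F::operator());
}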

Copy/paste that into your FastDelegate.h file.

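Do NOT pass a temporary lambda like this (the delegate stores only a pointer to the lambda, so the lambda must be a named object that outlives the delegate):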

home.visit(fastdelegate::MakeDelegate([&](const Room& a){ /* ... */ }));

Instead do this:

auto d = [&](const Room& a){ /* ... */ };
home.visit(fastdelegate::MakeDelegate(d));
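
To illustrate why the named lambda matters, here is a minimal sketch of a delegate that, like the wrapper above, keeps only a pointer to the callable. `TinyDelegate` and `call_with_named_lambda` are hypothetical names made up for this example and are not part of FastDelegate.h; the real library binds an object pointer plus a member function pointer, but the lifetime rule is the same.

```cpp
#include <cassert>

// Hypothetical stand-in for a fast delegate: it stores a pointer to the
// callable and a stub that knows how to invoke it. Nothing is copied.
class TinyDelegate {
public:
    // Takes an lvalue reference, so a temporary lambda cannot be passed.
    template <class F>
    explicit TinyDelegate(F& f)
        : object_(&f), stub_(&invoke<F>) {}

    int operator()(int x) const { return stub_(object_, x); }

private:
    // Casts the stored pointer back to the real closure type and calls it.
    template <class F>
    static int invoke(void* object, int x) {
        return (*static_cast<F*>(object))(x);
    }

    void* object_;             // points at the caller's lambda
    int (*stub_)(void*, int);  // type-erased call thunk
};

int call_with_named_lambda() {
    int offset = 10;
    // The lambda is a named local, so it is still alive when d is called.
    auto add_offset = [&offset](int x) { return x + offset; };
    TinyDelegate d(add_offset);
    offset = 32;   // visible through the by-reference capture
    return d(10);  // 10 + 32
}
```

Because `d` only points at `add_offset`, calling `d` after `add_offset` has gone out of scope would be undefined behavior, which is the situation the "Do NOT" example runs into.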

Let me know if I missed anything.

Community
0

The difference between a "trivial" lambda (no captures) and a general one is that a lambda with captures is a function object that carries state: its captures are stored inside the closure object.

If you copy the lambda and it contains references to temporary objects, or to stack-allocated objects that will be destroyed before your FastDelegate is, you have a dangling reference, hence the crash.

Try capturing by copy, not by reference.
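
As a small sketch of the difference (the function name `capture_demo` is made up for this example), a by-copy capture gives the closure its own value, while a by-reference capture only refers to the original variable:

```cpp
#include <cassert>

int capture_demo() {
    int value = 1;

    // by_copy owns a private copy of value, taken when the lambda is created.
    auto by_copy = [value] { return value; };

    // by_ref only refers to value; it is valid only while value is alive.
    auto by_ref = [&value] { return value; };

    value = 2;

    // by_copy still sees 1, by_ref sees the updated 2.
    return by_copy() * 10 + by_ref();
}
```

Note that capturing by copy only removes the dependency on the captured variables. Whatever stores the lambda still has to keep the closure object itself alive for as long as it can be called.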

Stefano Falasca
  • Your explanation is correct, but the point is having a faster drop-in for `std::function`, not changing existing lambda code. `std::function` works in the example where the lambda captures the `bool` by reference, but it's slow. My goal is to use the concepts behind Don Clugston's work to create a faster version of `std::function` – Vittorio Romeo Sep 05 '13 at 10:56
  • @Vittorio: You won't. Fast delegates only really work for a small subset of the scenarios where std::function can work. That's why they're faster. You are very unlikely to beat your Standard library implementer. – Puppy Sep 05 '13 at 12:34
  • @DeadMG Yeah, but sometimes you may not like the interface and then you make your own thing. For example, I don't like to use `std::bind` or lambda objects with `std::function` in order to call member functions. – user1095108 Sep 05 '13 at 12:42
  • @user1095108: Unless he is going to massively restrict his interface, he is still making a class with the same constraints as std::function, which would almost certainly lead to essentially the same implementation tradeoffs and therefore basically the same performance. – Puppy Sep 05 '13 at 12:43
  • @VittorioRomeo ever asked yourself why the people implementing `std::function` didn't use the concepts behind Don Clugston's work to create a faster version of `std::function`? – R. Martinho Fernandes Sep 05 '13 at 12:54
  • Or in other words: if you manage to get your faster **drop-in replacement** for `std::function`, could you please submit a patch to the standard library implementers so we can all get the goodies? – R. Martinho Fernandes Sep 05 '13 at 12:58
  • @R.MartinhoFernandes: that would be awesome. I doubt it's ever gonna happen (and if it does, I doubt I'll be the one to create a faster **drop-in replacement** for `std::function`). But I still like experimenting with it – Vittorio Romeo Sep 05 '13 at 13:03
  • @DeadMG I think the OP's experimentation is in order; he's looking for performance, a valid reason for a switch. And `std::function` (via boost) hails from C++03 days. There might be a better interface with C++11. – user1095108 Sep 05 '13 at 13:31
  • @DeadMG: what I find absurd is that binding a global raw function to a `std::function` has *noticeable* overhead. Using Don's fast delegate completely inlines the call to global raw functions for example. I can only think this is a defect in the current `std::function` implementations. – Vittorio Romeo Sep 05 '13 at 19:31
  • @VittorioRomeo That's ridiculous. As if Don's delegate somehow made your go function faster, if called through the delegate :) – user1095108 Sep 06 '13 at 10:57
  • @user1095108: nah, my guess is that, since the raw func is called before anything else, some caching or stuff like that occurs. Re-ordering the benchmarks inverts the timing (`dondelegate` goes to `20ms`, and `rawfunc` goes to `8 ms`). – Vittorio Romeo Sep 06 '13 at 11:02
  • @VittorioRomeo Kinda makes you doubt in your bench. – user1095108 Sep 06 '13 at 11:03