I have a C++ function that takes a std::function as an input argument.
Specifically, a std::function<void (const Message&, Error)>.

In my use-case, the caller may bind the std::function to either a free function or a member function.

(I'm not experienced with std::bind or std::function, so I found it noteworthy that the same object type, std::function<void (const Message&, Error)>, can hold a free function as well as a member function, the latter by way of std::bind. It gave me the impression that std::function abstracts away the difference between a function pointer and a member function pointer.)

For debugging purposes, it would be useful to log a hash, or anything else unique, associated with the std::function input argument.
Here's where I quickly realized I can't escape that fundamental difference between free function pointers and member function pointers.
I can get the underlying void (*)(const Message&, Error) free function pointer using std::function::target<void (*)(const Message&, Error)>(), which serves my needs as a unique hash.
But that doesn't work if the std::function<void (const Message&, Error)> is bound to a member function.
In my head, I reasoned that if the std::function<void (const Message&, Error)> was bound to a class Foo member function, then std::function::target<void (Foo::*)(const Message&, Error)>() would return a pointer to the stored member function pointer, but that doesn't seem to be the case.

Which leads to my question: is there any way to generically get a unique hash from a std::function instance, regardless of whether it's bound to a free function or a member function?

#include <functional>
#include <iostream>

using namespace std;

struct Message {
  int i_;
};

struct Error {
  char c_;
};

class Foo {
public:
  void print(const Message& m, Error e) {
    cout << "member func: " << m.i_ << " " << e.c_ << endl;
  }
};

void print(const Message& m, Error e) {
  cout << "free func: " << m.i_ << " " << e.c_ << endl;
}

void doWork(function<void (const Message&, Error)> f) {
  // I can invoke f regardless of whether it's been bound to a free function or
  // a member function...
  {
    Message m{42};
    Error e{'x'};

    f(m, e);
  }

  // ...but since I don't know whether f is bound to a free function or a member
  // function, I can't use std::function::target<>() to generically get a
  // function pointer, whose (void*) value would have served my need for a
  // hash...
  {
    typedef void (*Fptr)(const Message&, Error);
    typedef void (Foo::*Mfptr)(const Message&, Error);

    Fptr* fptr = f.target<Fptr>();
    Mfptr* mfptr = nullptr;

    cout << "free func target: " << (void*)fptr << endl;

    if (fptr) {
      cout << "free func hash: " << (void*)*fptr << endl;
    }
    else {
      // ...moreover, when f is bound to a Foo member function (using
      // std::bind), std::function::target<>() doesn't return a Foo member
      // function pointer either...I can't reason why not.
      // (this also isn't scalable because in future, f may be bound to a 
      // class Bar or class Baz member function)
      mfptr = f.target<Mfptr>();
      cout << "not a free function; checking for a Foo member function" << endl;
      cout << "member func target: " << (void*)mfptr << endl;

      if (mfptr) {
        cout << "member func hash: " << (void*)*mfptr << endl;
      }
    }
  }
}

int main()
{
  {
    function<void (const Message&, Error)> f = print;

    doWork(f);
  }

  cout << "---" << endl;

  {
    Foo foo;
    function<void (const Message&, Error)> f = bind(&Foo::print,
                                                    &foo,
                                                    placeholders::_1,
                                                    placeholders::_2);

    doWork(f);
  }

  return 0;
}

Compilation and output:

$ g++ --version && g++ -g ./main.cpp && ./a.out
g++ (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

free func: 42 x
free func target: 0x7ffda4547bf0
free func hash: 0x55db499c51e5
---
member func: 42 x
free func target: 0
not a free function; checking for a Foo member function
member func target: 0
Gabriel Staples
StoneThrow
  • how about.... `std::function::target_type().name()`? – KamilCuk Apr 26 '23 at 20:16
  • Possibly related: https://stackoverflow.com/questions/73744246/unique-id-for-any-kind-of-callable-object-in-c17 – joergbrech Apr 26 '23 at 20:18
  • Each function is just a pointer to an address in memory. Can't you just use that address? Ex: `size_t addr = (size_t)my_func;`. If two functions are equal, they have the same address. – Gabriel Staples Apr 26 '23 at 20:18
  • @KamilCuk -- looks promising; I'll experiment with this some more, and if you are inclined to expand your comment into an answer, I'd be happy to upvote/accept. – StoneThrow Apr 26 '23 at 20:21
  • @GabrielStaples -- that's exactly what I figured, and was trying to do. The issue is _getting that address_. I am not allowed to change the signature of `doWork()`, so I can only work with whatever `std::function` provides. – StoneThrow Apr 26 '23 at 20:23
  • If you know the exact way the `bind` was created, you may be able to obtain its type via `decltype(std::bind(...))`. And then `target<...>` will point to the instance. The problem is, it is likely allocated locally inside the `std::function`, meaning the address of target will change for each instance of `std::function` and there isn't any suitable interface for an output `bind` to tell anything about its intrinsics. – ALX23z Apr 26 '23 at 20:47

1 Answer

The following code:

#include <functional>
#include <iostream>
#include <vector>
#include <string>
#include <cstdint>

int f(int a) { return -a; }
int f2(int a) { return a; }

int main() {
    std::vector<std::function<int(int)>> fn{
        f,
        f,
        f2,
        f2,
        [](int a) {return -a;},
        [](int a) {return -a;},
        [](int a) {return -a;},
    };

    for (auto&& a : fn) {
        const auto t = a.target<int(*)(int)>();
        const auto hash = t ?
            (size_t)(uintptr_t)(void*)*t :
            a.target_type().hash_code();
        std::cout << hash << '\n';
    }
}

The vector is initialized with two entries for f, two entries for f2, and three lambdas. We therefore expect one hash shared by the two f entries, another shared by the two f2 entries, and three distinct hashes for the lambdas, because each lambda expression has its own type. The code outputs:

4198918
4198918
4198932
4198932
11513669940284151167
7180698749978361212
13008242069459866308
KamilCuk
  • This results in the same hash if lambda/functor comes from some factory, e.g. for `auto factory(int x){ return [x](int y){ return x + y;}; }` we have `factory(10)` and `factory(20)` share the type. – R2RT Apr 26 '23 at 20:51
  • It may not matter for OP nor invalidates the answer, but I thought it should be said as footnote. – R2RT Apr 26 '23 at 20:54
  • Is there any advantage to hashing the `target_type().name()`s (name-mangled lambda function names) with `std::hash{}(a.target_type().name())` rather than just obtaining the [`hash_code`](https://en.cppreference.com/w/cpp/types/type_info/hash_code) directly via `target_type().hash_code()`? – Gabriel Staples Apr 27 '23 at 00:44
  • Why cast the function address, `*t`, to `void*` and then to `uintptr_t` and _then_ to `size_t` via `(size_t)(uintptr_t)(void*)*t` instead of just casting straight to `size_t` via `(size_t)*t`? I've always just cast pointers directly to `size_t` with no intermediate casts in-between. – Gabriel Staples Apr 28 '23 at 00:30
  • Note: both [`hash_code`](https://en.cppreference.com/w/cpp/types/type_info/hash_code) and the return value of calling the [`std::hash<>::operator()`](https://en.cppreference.com/w/cpp/utility/hash/operator()) callable operator are `std::size_t`, so I recommend removing the `const auto hash =` usage of auto and saying `const std::size_t hash = ` instead. `auto` obfuscates the type of the `hash` variable in an undesirable way otherwise. – Gabriel Staples Apr 28 '23 at 00:37
  • `Why cast the function address, *t` It is undefined behavior to convert a function pointer to an integer, converting to a pointer is conditionally supported https://eel.is/c++draft/expr#reinterpret.cast-8 . Converting a pointer to an integer type not large enough is undefined behavior: https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p6 and https://eel.is/c++draft/expr#reinterpret.cast-4 . If `size_t` would not be large enough to store the result of pointer->integer conversion, the behavior would be undefined. The proper way is `void *`, then `uintptr_t` which is guaranteed to be large enough – KamilCuk Apr 28 '23 at 06:48
  • But, I will agree, this is all moot. Because the conversions themselves are implementation defined, we might as well expect compilers to just support `(size_t)*t` in an implementation defined manner instead of doing standard C++ shenanigans to avoid standard C++ undefined behavior. We all know only 3 C++ compilers matter. But from the other side, you never know when writers of optimizer part of the compiler will kick in, decide that this code is undefined behavior and optimize it all out. So it is what it is. – KamilCuk Apr 28 '23 at 06:53
  • Which 3 C++ compilers? Microsoft Visual C++, GNU gcc/g++, and LLVM clang? That's my guess. Then again, I think Apple/Mac has their own compiler too, and many microcontrollers have some other compilers too, frequently with only partial language implementations I think. – Gabriel Staples Apr 28 '23 at 16:24