I've got a variable-size binary data blob that is accessible through a std::byte pointer and a byte size. Is there a way to call std::hash on it? (I can't find any example of this.)

moala

2 Answers

Here's an example based on my comment:

#include <array>
#include <cstddef>
#include <functional>
#include <iostream>
#include <span>
#include <string>
#include <string_view>

using bytes = std::span<const std::byte>;

template <>
struct std::hash<bytes>
{
    std::size_t operator()(const bytes& x) const noexcept
    {
        return std::hash<std::string_view>{}({reinterpret_cast<const char*>(x.data()), x.size()});
    }
};

int main()
{
    auto integers = std::array { 1, 2, 3, 4, 5 };
    auto doubles = std::array { 1.2, 3.4, 5.6, 7.8, 9.0 };
    std::string string = "A string";
    
    auto hash = std::hash<bytes>{};

    std::cout << std::hex;
    std::cout << hash(std::as_bytes(std::span(integers))) << "\n";
    std::cout << hash(std::as_bytes(std::span(doubles))) << "\n";
    std::cout << hash(std::as_bytes(std::span(string))) << "\n";
}

Output (the exact values depend on the standard library implementation):

5ae7ce2c85367f67
42515290ba6efcdb
d760fec69e30b9e1

Of course, I used std::span, but you could use anything else (e.g. std::pair<const std::byte*, std::size_t>).
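
For the pointer-plus-size case from the question, the same idea can be sketched as a free function. This is only a minimal sketch: the name `hash_bytes` is hypothetical (it is not from the answer), and the result is whatever `std::hash<std::string_view>` produces on your implementation, so it is not stable across standard libraries.

```cpp
#include <cstddef>
#include <functional>
#include <string_view>

// Hypothetical helper (name chosen for illustration): hashes `size` bytes
// starting at `data` by viewing them as characters and delegating to
// std::hash<std::string_view>.
inline std::size_t hash_bytes(const std::byte* data, std::size_t size) noexcept
{
    return std::hash<std::string_view>{}(
        std::string_view(reinterpret_cast<const char*>(data), size));
}
```

Because it is an ordinary function rather than a specialization of std::hash, it does not touch namespace std at all.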

Alternatively, you could use MurmurHash (I believe that's what libstdc++ uses internally anyway).
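
If you want a hash that does not depend on the standard library's implementation, a hand-written byte hash is another option. The sketch below uses 64-bit FNV-1a instead of MurmurHash, purely because it fits in a few lines; the function name `fnv1a_64` is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a over a raw byte range. This is a swapped-in, simpler
// algorithm, not MurmurHash; the function name is hypothetical.
inline std::uint64_t fnv1a_64(const std::byte* data, std::size_t size) noexcept
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= std::to_integer<std::uint64_t>(data[i]);  // mix in one byte
        hash *= 1099511628211ULL;                         // FNV prime
    }
    return hash;
}
```

Unlike std::hash, this gives the same value for the same bytes on every platform, which matters if the hash is stored or sent elsewhere.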

LHLaurini

As far as I can tell, you have to create a custom hash function:

#include <array>
#include <cstddef>
#include <functional>
#include <iostream>

template<class T, size_t N>
struct std::hash<std::array<T, N>> {
    std::size_t operator()(const std::array<T, N>& data) const noexcept {
        std::size_t value = data.size();
        for (auto byte : data) {
            value = value * 31 + std::hash<std::byte>{}(byte);
            // value = std::hash<std::byte>{}(byte) + (value << 6) + (value >> 2);
        }
        return value;
    }
};

template<>
struct std::hash<std::byte*> {
    std::size_t operator()(const std::byte *ptr, std::size_t n) const noexcept {
        std::size_t value = n;
        for (std::size_t i = 0; i < n; ++i)
            value = value * 31 + std::hash<std::byte>{}(ptr[i]);
        return value;
    }
};

int main(int argc, const char *argv[]) {

    std::array<std::byte, 4> bytes{std::byte{0}, std::byte{1}, std::byte{2}, std::byte{3}};
    std::cout << std::hash<std::array<std::byte, 4>>{}(bytes) << std::endl;
    std::cout << std::hash<std::byte*>{}(bytes.data(), bytes.size()) << std::endl;
        
    return 0;
}

Output:

3695110
3695110

The types can be adjusted as necessary for your use case.

Edit: I added an alternative hash combiner (that is commented out) that in theory may produce better results.

Edit2: Added additional hash functor that takes a pointer and size.

RandomBits
  • Your example does not address a variable-size binary blob, but could be adapted to address it. – moala Apr 21 '23 at 19:40
  • @moala You are right, I missed the target. I added another hash functor that takes a pointer and size as indicated by the OP. – RandomBits Apr 21 '23 at 20:01
  • It's not clear why you use specialization of `std::hash`. The implementation doesn't satisfy [Hash](https://en.cppreference.com/w/cpp/named_req/Hash) requirements and can't be used where such requirement is expected (e.g. in unordered containers). – Evg Apr 21 '23 at 20:15
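
To illustrate the point about the Hash requirements: a hasher used by an unordered container has to be callable with a single key argument. One possible sketch, assuming the blob is owned by a `std::vector<std::byte>`, is a program-defined functor passed as the container's hasher. The names `ByteVectorHash` and `BlobSet` are made up for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string_view>
#include <unordered_set>
#include <vector>

// Hypothetical hasher: takes exactly one argument (the key), so it can be
// used as the Hash parameter of unordered containers. No specialization of
// std::hash is involved.
struct ByteVectorHash
{
    std::size_t operator()(const std::vector<std::byte>& blob) const noexcept
    {
        return std::hash<std::string_view>{}(
            std::string_view(reinterpret_cast<const char*>(blob.data()), blob.size()));
    }
};

// std::vector<std::byte> already provides operator==, so only the hasher
// needs to be supplied.
using BlobSet = std::unordered_set<std::vector<std::byte>, ByteVectorHash>;
```

Inserting the same byte sequence twice into a `BlobSet` leaves a single element, since equal keys hash to the same value and compare equal.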