I'm currently implementing a hash table in C++ using universal hashing (matrix hashing). I implement the matrix as an array of pointers; they are merely random bits and do not "work" as pointers, only as a 32x64 bit matrix. To hash a key, I multiply the matrix by the pointer key (using bit operations), which yields a 32-bit column (our hashed key). This raises a big question:
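For reference, the matrix multiplication described above can be sketched with fixed-width integers instead of pointers. This is only a minimal sketch, not the code from the question: the name `MatrixHash`, the seed parameter, and the use of `std::mt19937_64` to fill the rows are all assumptions made for illustration. Each of the 32 rows is stored as one `std::uint64_t`, and bit `i` of the hash is the parity of `rows[i] & key`.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical helper: a 32x64 bit matrix over GF(2), one 64-bit word per row.
struct MatrixHash {
    std::array<std::uint64_t, 32> rows{};

    // Fill every row with random bits (the seed is an assumed parameter).
    explicit MatrixHash(std::uint64_t seed) {
        std::mt19937_64 gen(seed);
        for (auto& r : rows) r = gen();
    }

    // Matrix-vector product over GF(2): bit i of the result is the
    // parity of the bitwise AND of row i with the 64-bit key.
    std::uint32_t operator()(std::uint64_t key) const {
        std::uint32_t h = 0;
        for (unsigned i = 0; i < 32; ++i) {
            std::uint64_t x = rows[i] & key;
            // Fold the 64 bits down to a single parity bit.
            x ^= x >> 32;
            x ^= x >> 16;
            x ^= x >> 8;
            x ^= x >> 4;
            x ^= x >> 2;
            x ^= x >> 1;
            h |= static_cast<std::uint32_t>(x & 1u) << i;
        }
        return h;
    }
};
```

Because the product is linear over GF(2), `h(a ^ b) == h(a) ^ h(b)` holds for any two keys, which is a quick sanity check for an implementation like this.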

Is it possible to fill a class object (more precisely, a C++ string) with random bits and do bit operations on it? I do not care if the data in the string is pure garbage; I only use it to hash. Or, as an alternative, how can I make a 32-byte type and cast a string into one?

  • Why would you want to cast a string to a 32-byte type? Where does the string come from? – Ry- May 06 '17 at 01:36
  • The string is the key for my hash table. And I got the 32 byte from using sizeof(string). – mmiranda96 May 06 '17 at 01:41
  • The actual std::string object contains a bunch of metadata but may or may not contain the actual string data. (Probably not unless short string optimisation is in operation and your string fits within it). You might want to rethink this. – user1937198 May 06 '17 at 01:45
  • The container for bits is bitset, not std::string – PapaDiHatti May 06 '17 at 01:48
  • `sizeof(std::string)` can vary across implementations and even for different optimisation flags on a single implementation - overwriting the `std::string` object with random data is a bad idea. If your needs are met by a [`std::bitset`](http://en.cppreference.com/w/cpp/utility/bitset) well and good, otherwise consider using an array of `uint32_t` or `uint64_t` - then you can index the array and apply normal bitwise operations on the elements. You could build a class to contain the array and allow higher-level operations `operator|` to apply `|` to each element in turn. – Tony Delroy May 06 '17 at 01:54
  • Is it possible to map/cast a string to a 32 byte bitset? – mmiranda96 May 06 '17 at 01:57
  • @mmiranda96 How long is your string? If it is not fixed size there is no way to convert it to a bitset. – user1937198 May 06 '17 at 02:11
  • I often use std::string as a buffer for binary data, including fixed size data. You just need to make sure you use length, never any null-terminated-string based functions. But for your code, I would probably use uint32_t[8] or uint64_t[4]. Whichever makes it easier to do the arithmetic operations on them. Or std::bitset as they said. But using a pointer as if it were a fixed-size integer type is very non-portable. – Kenny Ostrom May 06 '17 at 02:28
  • @mmiranda96 Yes, it's called hashing. – Captain Obvlious May 06 '17 at 02:33
  • Bitset seems the best approach for this particular problem. And hashing makes it great for fixing the size, @CaptainObvlious. – mmiranda96 May 06 '17 at 02:57
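
One way to read the suggestions in the comments is to reduce the string's characters (not the `std::string` object itself) into a fixed 32-byte block of `uint64_t` words. The sketch below is only an illustration under assumptions: `Block256` and `fold_string` are hypothetical names, and the XOR folding is a simple way to fix the size, not a universal hash on its own, so distinct strings can fold to the same block. It relies on `size()` rather than a null terminator, as recommended above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical 32-byte block: four 64-bit words with element-wise OR.
struct Block256 {
    std::array<std::uint64_t, 4> words{};

    friend Block256 operator|(Block256 a, const Block256& b) {
        for (std::size_t i = 0; i < 4; ++i) a.words[i] |= b.words[i];
        return a;
    }
};

// Hypothetical helper: XOR each character of the string into byte
// position (i mod 32) of the block. Uses size(), so embedded '\0'
// characters are handled like any other byte.
Block256 fold_string(const std::string& s) {
    Block256 out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::size_t pos = i % 32;  // which of the 32 bytes
        std::uint64_t byte = static_cast<unsigned char>(s[i]);
        out.words[pos / 8] ^= byte << (8 * (pos % 8));
    }
    return out;
}
```

Each of the four words could then be fed to ordinary bitwise operations, or the block could be copied into a `std::bitset<256>` if that interface is more convenient.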

0 Answers