
I need a hash function for my C code, and I found what I presume is the 32-bit MurmurHash3 function.

  1. I don't understand what to pass as len and seed.

  2. I passed arbitrary values for len and seed (2,000 and 2 respectively), but I get very large numbers such as -1837466777 or 5738837646 (not the exact results, but similar in form). I also saw something about bit-masking the result.

My first question: what do len and seed mean, explained in simple terms?

My second question: if that is a valid return value, what do I do with it to get an actual key that I can use for my hash table?

Please keep the explanation as simple and broken down as possible. I apologize for my difficulty with complex mathematics and advanced theorems. I just need a practical answer that I can use immediately, and I will study the details behind it later.

Thank you so much and I really appreciate any help.

Here is the code below:

#include <stdint.h>

uint32_t murmur3_32(const char *key, uint32_t len, uint32_t seed)
{
    static const uint32_t c1 = 0xcc9e2d51;
    static const uint32_t c2 = 0x1b873593;
    static const uint32_t r1 = 15;
    static const uint32_t r2 = 13;
    static const uint32_t m = 5;
    static const uint32_t n = 0xe6546b64;

    uint32_t hash = seed;

    const int nblocks = len / 4;
    const uint32_t *blocks = (const uint32_t *) key;
    int i;
    for (i = 0; i < nblocks; i++) {
        uint32_t k = blocks[i];
        k *= c1;
        k = (k << r1) | (k >> (32 - r1));
        k *= c2;

        hash ^= k;
        hash = ((hash << r2) | (hash >> (32 - r2))) * m + n;
    }

    const uint8_t *tail = (const uint8_t *) (key + nblocks * 4);
    uint32_t k1 = 0;

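    /* Mix in the 1 to 3 leftover bytes that do not fill a whole 4-byte block. */
    switch (len & 3) {
    case 3:
        k1 ^= tail[2] << 16;
        /* fall through */
    case 2:
        k1 ^= tail[1] << 8;
        /* fall through */
    case 1:
        k1 ^= tail[0];

        k1 *= c1;
        k1 = (k1 << r1) | (k1 >> (32 - r1));
        k1 *= c2;
        hash ^= k1;
    }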

    hash ^= len;
    hash ^= (hash >> 16);
    hash *= 0x85ebca6b;
    hash ^= (hash >> 13);
    hash *= 0xc2b2ae35;
    hash ^= (hash >> 16);

    return hash;
}
  • Note the first line: `hash = seed;`. This implies that `seed` can be the return value (`hash`) of a prior call to the function. Passing the seed as an argument allows the function to be called multiple times on "chunks" of data (e.g. if the total size of the data stream were 10,000 bytes, you could loop on `hash = murmur3_32(buf,len,hash);`). Thus you could loop over a file with `len = read(...,buf,sizeof(buf));` followed by a call to the hash function, and get a hash of the entire file even if it were GB in size. The initial seed could be any value but is probably 0. – Craig Estey Oct 03 '20 at 20:43
  • 0) start by using your own (maybe silly) hash function. 1) find a way to use another hash function. 2) use that. – wildplasser Oct 03 '20 at 20:48
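
To make the two parameters concrete, here is a minimal sketch rather than a definitive answer. In it, `len` is the number of bytes in the key itself (for a C string, `strlen(key)`), so passing 2,000 for a shorter key makes the function read past the end of the buffer. `seed` is any constant that stays the same for the lifetime of the table. The names `TABLE_SIZE`, `TABLE_SEED`, `bucket_index` and `hash_in_chunks` are hypothetical and not part of the question. The hash function is the one from the question, except that this sketch reads blocks with `memcpy` instead of a pointer cast, which is an assumption made here to avoid unaligned reads.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit value left by r bits (0 < r < 32). */
static uint32_t rotl32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32 - r));
}

/* Same algorithm as in the question; blocks are copied out with memcpy
   (an assumption of this sketch) instead of being read through a cast. */
static uint32_t murmur3_32(const char *key, uint32_t len, uint32_t seed)
{
    const uint32_t c1 = 0xcc9e2d51, c2 = 0x1b873593;
    const uint8_t *data = (const uint8_t *) key;
    const uint32_t nblocks = len / 4;
    uint32_t hash = seed;
    uint32_t k;
    uint32_t i;

    for (i = 0; i < nblocks; i++) {
        memcpy(&k, data + i * 4, sizeof k);
        k *= c1;
        k = rotl32(k, 15);
        k *= c2;
        hash ^= k;
        hash = rotl32(hash, 13) * 5 + 0xe6546b64;
    }

    const uint8_t *tail = data + nblocks * 4;
    k = 0;
    switch (len & 3) {
    case 3: k ^= (uint32_t) tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t) tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
        k *= c1;
        k = rotl32(k, 15);
        k *= c2;
        hash ^= k;
    }

    hash ^= len;
    hash ^= hash >> 16;
    hash *= 0x85ebca6b;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35;
    hash ^= hash >> 16;
    return hash;
}

#define TABLE_SIZE 101u /* hypothetical number of buckets */
#define TABLE_SEED 42u  /* hypothetical seed; any constant that never changes */

/* Map a NUL-terminated key to a bucket index in [0, TABLE_SIZE). */
static uint32_t bucket_index(const char *key)
{
    uint32_t len = (uint32_t) strlen(key); /* bytes in the key, not the table size */
    return murmur3_32(key, len, TABLE_SEED) % TABLE_SIZE;
}

/* Chained hashing as described in the first comment: every call is seeded
   with the previous result. The value depends on the chunk size and is not
   expected to equal the hash of the whole buffer taken in one call. */
static uint32_t hash_in_chunks(const char *buf, uint32_t total, uint32_t chunk)
{
    uint32_t hash = 0; /* initial seed */
    uint32_t off = 0;

    if (chunk == 0)
        return hash; /* avoid looping forever on a zero chunk size */
    while (off < total) {
        uint32_t n = (total - off < chunk) ? total - off : chunk;
        hash = murmur3_32(buf + off, n, hash);
        off += n;
    }
    return hash;
}
```

With this, `bucket_index("apple")` gives the slot to use in an array of `TABLE_SIZE` buckets. The raw return value is a valid unsigned 32-bit number: printing it with `%d` is what makes it look negative, so use `%u` or `PRIu32` from `<inttypes.h>`. The bit-masking mentioned in the question is the same reduction step for table sizes that are a power of two, where `hash & (size - 1)` equals `hash % size`.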
