
I want to use MurmurHash3 in a deduplication system with no adversary, for instance to hash files.

However, I am having problems using it, so I must be doing something wrong.

The MurmurHash3_x86_128() function (source code) takes four parameters. This is my understanding of what they are:

key - input data to hash

len - data length

seed - seed

out - computed hash value

When I run it, it fails with segmentation faults because of this part of the code:

    void MurmurHash3_x86_128 ( const void * key, const uint32_t len,
                               uint32_t seed, void * out )
    {
     const uint8_t * data = (const uint8_t*)key;
     const uint32_t nblocks = len / 16;
     ...

     const uint32_t * blocks = (const uint32_t *)(data + nblocks*16);

     for(i = -nblocks; i; i++)
     {
            uint32_t k1 = blocks[i*4];
            ...
     }
     ...
    }

If my data is longer than 15 bytes (which it is), this for loop is executed. However, blocks points to the end of my data array, and the loop then starts accessing memory after that position. That explains the segmentation faults, so key can't be just my data array.

My question is: what should I pass as the key parameter?


Problem Solved

After reading Mats Petersson's answer I realized my code had a bug: i must be a signed int, and I had declared it unsigned. That is why it was adding memory positions to blocks instead of subtracting them.
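As a rough illustration of that bug, the sketch below computes the array index used for the first k1 read when nblocks is 4, once with a signed i and once with an unsigned one. The function names are hypothetical and not part of MurmurHash3, and the code assumes a platform where int is 32 bits. It only computes the indices; it never dereferences anything.

```c
#include <stdint.h>

/* Hypothetical helper: index of the first k1 read when i is a signed int,
   as in the original C++ source. */
int64_t first_index_signed(uint32_t nblocks)
{
    int i = -(int)nblocks;        /* e.g. -4 */
    return (int64_t)i * 4;        /* e.g. -16: counts back from blocks */
}

/* Hypothetical helper: index of the first k1 read when i is mistakenly
   declared uint32_t. Both the negation and the multiplication wrap
   modulo 2^32, so the result is a large positive number. */
int64_t first_index_unsigned(uint32_t nblocks)
{
    uint32_t i = -nblocks;        /* wraps to 2^32 - nblocks */
    uint32_t index = i * 4;       /* wraps to 2^32 - 4*nblocks */
    return (int64_t)index;
}
```

With nblocks equal to 4, the signed version yields -16 and the unsigned version yields 4294967280. On a platform with 64-bit pointers the unsigned index is not sign-extended, so blocks[i*4] would point far past the end of the data, which is consistent with the segmentation faults described above.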

Leaurus

1 Answer


blocks points just past the last complete 16-byte block of the data, that is, at data + nblocks*16.

i starts at -nblocks, and is always less than zero (loop ends at zero).

So, say you have 64 bytes of data, then the pointer blocks will point at data + 64 bytes, and nblocks will be 4.

When we get to k1 = blocks[i*4]; the first time, i = -4, so we get index -16. That index is multiplied by sizeof(*blocks), which is 4 because blocks points to uint32_t, so we get a byte offset of -64 from blocks, which is the start address of data.
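The arithmetic above can be checked with a small sketch. The helper below is hypothetical (it is not part of MurmurHash3): it records, for each loop iteration, the byte offset of the k1 read relative to the start of the data, using the same loop bounds and the same i*4 index as the excerpt in the question. The caller is assumed to supply an output array with at least len / 16 entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores the byte offset (from the start of the data)
   of every k1 read and returns the number of 16-byte blocks visited. */
int k1_byte_offsets(int len, ptrdiff_t *out)
{
    const int nblocks = len / 16;
    /* blocks points nblocks*16 bytes past the start of the data */
    const ptrdiff_t blocks_offset = (ptrdiff_t)nblocks * 16;
    int n = 0;

    for (int i = -nblocks; i; i++) {
        ptrdiff_t index = (ptrdiff_t)i * 4;   /* the index in blocks[i*4] */
        /* each index step moves sizeof(uint32_t) = 4 bytes */
        out[n++] = blocks_offset + index * (ptrdiff_t)sizeof(uint32_t);
    }
    return n;
}
```

For 64 bytes of data this produces the offsets 0, 16, 32 and 48, so every read stays inside the data as long as i is signed.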

Mats Petersson
Now I understand! I had to adapt the C++ version to C, and I switched i from int to uint by mistake. That's why it was adding memory positions and not subtracting! Thanks!! – Leaurus Jan 27 '13 at 20:50