Questions tagged [murmurhash]

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup.

MurmurHash generates roughly the same number of collisions as alternative hashes over a wide range of input data.

95 questions
1
vote
0 answers

Converting 64 bit hash to integer in range

I'm trying to hash a string to a location in a byte array of length n. I am using the following implementation of the Murmur hash function: https://github.com/tnm/murmurhash-java/blob/master/src/main/java/ie/ucd/murmur/MurmurHash.java in order to…
Alk
  • 5,215
  • 8
  • 47
  • 116
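
A common way to turn a 64-bit hash into an index in the range [0, n) is to treat the value as unsigned and take it modulo n. The sketch below assumes the hash arrives as a signed 64-bit integer, as a Java `long` would; the names `to_unsigned64` and `bucket_index` and the sample value are hypothetical and not taken from the linked implementation.

```python
MASK64 = (1 << 64) - 1


def to_unsigned64(h: int) -> int:
    """Reinterpret a signed 64-bit value (e.g. a Java long) as unsigned."""
    return h & MASK64


def bucket_index(h: int, n: int) -> int:
    """Map a 64-bit hash value to an index in [0, n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return to_unsigned64(h) % n


# Hypothetical hash value, written the way a signed 64-bit type would report it.
signed_hash = -5309672324031917724
index = bucket_index(signed_hash, 1000)
```

In Java itself, `Long.remainderUnsigned(hash, n)` or `Math.floorMod(hash, n)` should give a non-negative index without manual masking. The modulo step is slightly biased when n does not divide 2^64, which rarely matters for array sizes that fit in memory.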
1
vote
2 answers

Murmurhash2 Unsigned Int overflow

I'm currently trying to implement a hashtable/trie, but when I pass in parameters to murmurhash2, it gives back a number but I get run time errors of unsigned int overflow: test.c:53:12: runtime error: unsigned integer overflow: 24930 * 1540483477…
Jeff Guan
  • 13
  • 4
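
For context, 1540483477 is 0x5bd1e995, the multiplier used by MurmurHash2, and the hash relies on the product wrapping around modulo 2^32. In C, wraparound of unsigned arithmetic is well defined, so a report like the one quoted probably comes from an opt-in sanitizer check (for example Clang's `-fsanitize=unsigned-integer-overflow`) rather than from undefined behavior; that reading is an assumption, since the excerpt does not show the build flags. The sketch below reproduces the wrapped result with Python integers; `mul32` is a hypothetical helper name.

```python
UINT32_MASK = 0xFFFFFFFF
M = 0x5BD1E995  # 1540483477, the multiplier constant in MurmurHash2


def mul32(a: int, b: int) -> int:
    """Multiply as C does for 32-bit unsigned operands: keep the low 32 bits."""
    return (a * b) & UINT32_MASK


full_product = 24930 * M       # exceeds 32 bits when computed exactly
wrapped = mul32(24930, M)      # the value a 32-bit unsigned multiply yields
```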
1
vote
1 answer

Python pip SpaCy Installation Error with C++ and Murmurhash

EDIT: see the comments for the correct answer. Hi guys, here is a problem I have been having with installing the NLP library spaCy. I tried both pip install -U spacy and pip install spacy, but I seem to get the same error. I tried this on…
Kevin
  • 391
  • 3
  • 6
  • 22
1
vote
0 answers

Values for MAD compression method?

I am stuck trying to implement the perfect hashing technique using universal hashing at each level from Cormen. Specifically, with the compression method (at least, I think this is where my problem is). I am working on strings, I think short strings (between…
germelcar
  • 56
  • 6
1
vote
1 answer

Generating long unique id with Murmur3 from google guava

At the moment I am trying to generate unique identifiers of type long on the client side. I have a parent/child relationship where the parent already has a UUID as its identifier. I want to consider the Parent-UUID for calculating a Child-Id of type…
1
vote
1 answer

Is it possible for MurmurHash3 to produce a 64-bit hash where the upper 32 bits are all 0?

Looking at https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp I don't think so, but I wanted to check. The situation is this: if I have a key of 1, 2, 3 or 4 bytes, is it reliable to simply take the numeric value of those bytes…
zcourts
  • 4,863
  • 6
  • 49
  • 74
1
vote
1 answer

Python: Python.h file missing

I am using Ubuntu 16.04. I am trying to install the Murmurhash Python library, but it throws an error: command 'x86_64-linux-gnu-gcc' failed with exit status 1. I looked on the Internet and it says that this error is due to missing Python header files.…
rombi
  • 199
  • 3
  • 22
1
vote
3 answers

C++: What should we pass in MurmurHash3 parameters?

I am confused about which parameters I should provide for MurmurHash3_x86_128(). The MurmurHash3 code can be found at https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp. The method definition is given below. void MurmurHash3_x86_128 (…
rombi
  • 199
  • 3
  • 22
1
vote
1 answer

How does Murmurhash3_x86_128 work for data larger than 15 bytes?

I want to use MurmurHash3 in a deduplication system with no adversary. So Murmurhash3 will hash files, for instance. However I am having problems using it, meaning I am doing something wrong. Murmurhash3_x86_128() (source-code) function receives…
Leaurus
  • 376
  • 3
  • 13
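
MurmurHash3's 128-bit variants consume the input in 16-byte blocks and then mix in whatever 0 to 15 bytes remain as a tail, so inputs longer than 15 bytes go through the block loop one or more times. The sketch below shows only that partitioning, not the mixing itself; `split_blocks_and_tail` is a hypothetical helper and not part of the reference source.

```python
BLOCK_SIZE = 16  # bytes consumed per body iteration in the 128-bit variants


def split_blocks_and_tail(data: bytes):
    """Split data into full 16-byte blocks plus a tail of 0 to 15 bytes."""
    nblocks = len(data) // BLOCK_SIZE
    blocks = [
        data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] for i in range(nblocks)
    ]
    tail = data[nblocks * BLOCK_SIZE:]
    return blocks, tail


# A 40-byte input yields two full blocks and an 8-byte tail.
blocks, tail = split_blocks_and_tail(bytes(range(40)))
```

For whole files, the length passed to the function has to be the number of bytes in the buffer, not the length of a C string, since file contents can contain zero bytes.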
0
votes
0 answers

The results of murmurhash in PySpark and local Python are different

I use murmurhash to compute a hash value, but the results in PySpark and in local Python are different. Local Python: the hash value of 54958 is 5309672324031917724. PySpark: the hash value of 54958 is -878367076.
Gaurav
  • 194
  • 8
0
votes
0 answers

MetroHash has different hashing than MurmurHash3 in Javascript?

So when I use murmurhash-native/stream in JavaScript like so: const murmur = require("murmurhash-native/stream"); const hash = murmur.createHash("murmurhash128"); hash.update('hash').digest('hex'); console.log(hash.digest("hex")); the…
Prome88
  • 125
  • 5
0
votes
0 answers

Python string hashing without collision

Is there any approach to implement hashing without any collisions in Python 3? I am using mmh3: import mmh3 string = "/hjhfkhdf/jefhfueiow-/eflkjhfeiero-kk&/kerdfujelifjr(0kjlegjfejf/?/jdfkhe" mmh3.hash128(string) To avoid…
Vidya
  • 547
  • 1
  • 10
  • 26
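
No fixed-width hash can be collision-free over arbitrary strings, because there are more possible strings than output values. What can be estimated is how likely a collision is for a given number of keys. The sketch below uses the standard birthday-bound approximation and assumes the hash output is uniformly distributed; `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(num_keys: int, hash_bits: int) -> float:
    """Birthday-bound estimate: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** hash_bits
    exponent = -num_keys * (num_keys - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


p_32_bits = collision_probability(77_163, 32)      # about one half
p_128_bits = collision_probability(10**9, 128)     # vanishingly small
```

Under that uniformity assumption, a 128-bit hash makes accidental collisions extremely unlikely at realistic data sizes, though never impossible.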
0
votes
0 answers

Replication of Python feature hashing does not work using Java

I am trying to write a Java method that replicates Python's FeatureHasher. https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.FeatureHasher.html Below is the Python code. >>> from…
Tuhin Subhra Mandal
  • 473
  • 1
  • 5
  • 15
0
votes
1 answer

Does Murmurhash have collisions on 32-bit inputs?

Consider the standard Murmurhash, giving 32-bit output values. Suppose that we apply it to 32-bit inputs: are there collisions? In other words, does Murmurhash basically encode a permutation when applied to 32-bit inputs? If collisions exist, can…
M A
  • 209
  • 2
  • 7
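
One way to approach this is to check whether each step of the hash is invertible modulo 2^32. The sketch below covers only the 32-bit finalizer of MurmurHash3 (`fmix32`), which is built from xor-shifts and multiplications by odd constants; both kinds of step can be undone, so the finalizer maps 32-bit values one to one. Whether a complete hash of a 4-byte key is a permutation depends on the variant and is not shown here. The inverse function and the `_unxorshift` helper are illustrative names, and `pow(x, -1, m)` needs Python 3.8 or later.

```python
MASK32 = 0xFFFFFFFF
C1 = 0x85EBCA6B  # fmix32 multiplier (odd, so invertible mod 2**32)
C2 = 0xC2B2AE35  # fmix32 multiplier (odd, so invertible mod 2**32)
C1_INV = pow(C1, -1, 1 << 32)
C2_INV = pow(C2, -1, 1 << 32)


def fmix32(h: int) -> int:
    """MurmurHash3 32-bit finalizer."""
    h &= MASK32
    h ^= h >> 16
    h = (h * C1) & MASK32
    h ^= h >> 13
    h = (h * C2) & MASK32
    h ^= h >> 16
    return h


def _unxorshift(h: int, shift: int) -> int:
    """Undo `h ^= h >> shift` on a 32-bit value."""
    result = h
    s = shift
    while s < 32:
        result ^= h >> s
        s += shift
    return result


def fmix32_inverse(h: int) -> int:
    """Apply the inverse of each fmix32 step in reverse order."""
    h = _unxorshift(h & MASK32, 16)
    h = (h * C2_INV) & MASK32
    h = _unxorshift(h, 13)
    h = (h * C1_INV) & MASK32
    h = _unxorshift(h, 16)
    return h


# Round trip on a few sample values.
samples = [0, 1, 54958, 0xDEADBEEF, MASK32]
round_trips = [fmix32_inverse(fmix32(x)) for x in samples]
```

Because an inverse exists, two different 32-bit inputs to the finalizer cannot produce the same output.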
0
votes
1 answer

How to generate hash of arbitrary length with MurmurHash3 32 bit

I am currently trying to hash a set of strings using MurmurHash3, since a 32-bit hash seems to be too large for me to handle. I wanted to reduce the number of bits used to generate hashes to around 24 bits. I already found some questions explaining how…
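
A usual way to get a shorter hash from a 32-bit one is to keep only some of its bits, for example by masking. The sketch below assumes the 32-bit value has already been computed elsewhere; `truncate_hash` and the sample value are hypothetical.

```python
def truncate_hash(h32: int, bits: int) -> int:
    """Keep the low `bits` bits of a 32-bit hash value."""
    if not 0 < bits <= 32:
        raise ValueError("bits must be between 1 and 32")
    return (h32 & 0xFFFFFFFF) & ((1 << bits) - 1)


# Hypothetical 32-bit hash value reduced to 24 bits.
h24 = truncate_hash(0xDEADBEEF, 24)
```

Fewer bits mean more collisions: with 24 bits there are only 2^24 distinct values, so collisions are likely to appear after a few thousand keys.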