I'm using hash_ring
package for distributing objects among servers. I've assumed that distribution would be uniform, as it's based on MD5 hashes. Unfortunately it's not the case.
I'm using random keys which are generated using uuid.uuid4()
. I've verified, that MD5 itself in fact does give uniform distribution. However, when I'm distributing using hash_ring.HashRing
there are differences of 20-30% between most and least populated buckets.
- Can uniformity of distribution be improved in
hash_ring
by tweaking some settings? - Are there other good alternatives to do consistent hashing in Python?
The code I've used to test uniformity of distribution:
ring = hash_ring.HashRing(range(8))
for _ in range(10):
counters = [0]*8
for _ in range(100000):
counters[ring.get_node(str(uuid.uuid4()))] += 1
print counters
Which printed out:
[11115, 11853, 14033, 11505, 13640, 12292, 12851, 12711]
[11164, 11833, 14024, 11562, 13365, 12302, 13002, 12748]
[11354, 11756, 14017, 11583, 13201, 12231, 13135, 12723]
[11182, 11672, 13936, 11441, 13563, 12240, 13129, 12837]
[11069, 11866, 14045, 11541, 13443, 12249, 12894, 12893]
[11196, 11791, 14158, 11533, 13517, 12319, 13039, 12447]
[11297, 11944, 14114, 11536, 13154, 12289, 12990, 12676]
[11250, 11770, 14145, 11657, 13340, 11960, 13161, 12717]
[11027, 11891, 14099, 11615, 13320, 12336, 12891, 12821]
[11148, 11677, 13965, 11557, 13503, 12309, 13123, 12718]
For comparison I've did non-consistent hashing directly using MD5:
for _ in range(10):
counters = [0]*8
for _ in range(100000):
counters[int(hashlib.md5(str(uuid.uuid4())).hexdigest(),16)%8] += 1
print counters
with much better results:
[12450, 12501, 12380, 12643, 12446, 12444, 12506, 12630]
[12579, 12667, 12523, 12385, 12386, 12445, 12702, 12313]
[12624, 12449, 12451, 12401, 12580, 12449, 12562, 12484]
[12359, 12543, 12371, 12659, 12508, 12416, 12619, 12525]
[12425, 12526, 12565, 12732, 12381, 12481, 12335, 12555]
[12514, 12576, 12528, 12294, 12658, 12319, 12518, 12593]
[12500, 12471, 12460, 12502, 12637, 12393, 12442, 12595]
[12583, 12418, 12428, 12311, 12581, 12780, 12372, 12527]
[12395, 12569, 12544, 12319, 12607, 12488, 12424, 12654]
[12480, 12423, 12492, 12433, 12427, 12502, 12635, 12608]