When caching files based on hash characters, why use the last characters rather than the first?

Question

I have seen it suggested that you get better "load-balancing" within a cache by using the last characters of a hashed filename - it's what nginx does for example (proxy cache module). Can anyone explain why the last characters are used?

EDIT:

For example:

md5('asdf')
'912ec803b2ce49e4a541068d495ab570'
md5('asdg')
'7e6a6a87bf3ffb29a6dd9f14afdc3b88'

"seem" random enough.

score 0 · Answer 1 · answered Nov 09 '10 at 14:21


It's common to have lots of files that start with the same prefix. By keying on the end of the name rather than the start (in effect, reversing it), you can increase the randomness of the characters used for bucketing.

answered Nov 09 '10 at 14:21

Samuel Neff



This is especially true if the filename is a path. – nickm Nov 09 '10 at 14:35

But the file names are hashed. The question is why are the last characters of a *hash* any more random than the first? – jerd Nov 10 '10 at 06:09
@jerd, there are different types of hashing algorithms. md5/sha256/etc are used to generate a unique and consistent key from an input value for comparison. Cache hashing algorithms generate a single number from an input value used for placing cached items within a much much smaller number of buckets. Cache hashing algorithms are much less unique and less random. – Samuel Neff Nov 10 '10 at 11:34
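
To make the discussion above concrete, here is a rough Python sketch, not nginx's actual code. It assumes the layout described in the nginx documentation for `proxy_cache_path` with `levels=1:2`, where the last hex character of the MD5 digest names the first-level directory and the two characters before it name the second level. The helper names (`md5_hex`, `nginx_style_path`, `bucket_counts`) and the sample keys are made up for illustration. It also counts how keys fall into 16 buckets when you pick the first versus the last hex character, so you can check for yourself how evenly each choice spreads them.

```python
import hashlib
from collections import Counter


def md5_hex(key: str) -> str:
    """Return the MD5 hex digest of a cache key (hypothetical helper)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def nginx_style_path(digest: str, levels=(1, 2)) -> str:
    """Build a relative path from the trailing characters of the digest.

    Assumption: mirrors the documented levels=1:2 layout, i.e. the last
    character is the first directory, the two before it are the second.
    """
    parts = []
    end = len(digest)
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    return "/".join(parts + [digest])


def bucket_counts(keys, pick):
    """Count how many keys land in each bucket chosen by pick(digest)."""
    return Counter(pick(md5_hex(k)) for k in keys)


# Hypothetical keys that all share a long common prefix.
keys = [f"/images/photo_{i}.jpg" for i in range(16000)]

first = bucket_counts(keys, lambda d: d[0])   # bucket by first hex char
last = bucket_counts(keys, lambda d: d[-1])   # bucket by last hex char

print("first char: min", min(first.values()), "max", max(first.values()))
print("last char:  min", min(last.values()), "max", max(last.values()))
print(nginx_style_path("b7f54b2df7773722d382f4809d65029c"))
# prints c/29/b7f54b2df7773722d382f4809d65029c
```

With a digest like MD5, both ends should come out roughly even, which is the point jerd raises. The shared-prefix concern applies when bucketing on raw, unhashed names.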
