I have seen it suggested that you get better "load-balancing" within a cache by using the last characters of a hashed filename - it's what nginx does for example (proxy cache module). Can anyone explain why the last characters are used?

EDIT:

For example:

md5('asdf')
'912ec803b2ce49e4a541068d495ab570'
md5('asdg')
'7e6a6a87bf3ffb29a6dd9f14afdc3b88'

"seem" random enough.

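One way to check the intuition above is to hash many filenames that share a long prefix and count how often each hex digit appears in the first and in the last position of the MD5 digest. This is only a minimal sketch: the filenames and the helper names `md5_hex` and `bucket_counts` are made up for illustration and are not taken from nginx.

```python
import hashlib
from collections import Counter

def md5_hex(name: str) -> str:
    """Return the MD5 digest of a name as 32 lowercase hex characters."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()

def bucket_counts(names, position):
    """Count how many names fall into each of the 16 buckets selected
    by the hex digit at `position` of the digest (0 = first, -1 = last)."""
    return Counter(md5_hex(n)[position] for n in names)

# Hypothetical filenames that all share the same long prefix.
names = [f"/var/www/images/photo_{i}.jpg" for i in range(16000)]

first = bucket_counts(names, 0)
last = bucket_counts(names, -1)

# With a well-mixed hash, both spreads should look about equally even.
print("first char: min", min(first.values()), "max", max(first.values()))
print("last char:  min", min(last.values()), "max", max(last.values()))
```

If both positions give a similarly even spread, that suggests the choice between first and last characters does not matter for a hash like MD5, which is the point of the question.
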
jerd

1 Answer

It's common to have lots of files that start with the same prefix. By reversing the name you can increase the randomness of how the files are spread across buckets.

Samuel Neff
  • This is especially true if the filename is a path. – nickm Nov 09 '10 at 14:35
  • But the file names are hashed. The question is why are the last characters of a *hash* any more random than the first? – jerd Nov 10 '10 at 06:09
  • @jerd, there are different types of hashing algorithms. md5/sha256/etc. are used to generate a unique and consistent key from an input value for comparison. Cache hashing algorithms reduce an input value to a single number that places a cached item into one of a much smaller number of buckets, so they are far less collision-resistant and less random. – Samuel Neff Nov 10 '10 at 11:34
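
For reference, here is a rough sketch of how trailing hash characters can be turned into a directory layout, modeled on the `levels=1:2` example in the nginx `proxy_cache_path` documentation. The function name `nginx_style_path` and its parameters are hypothetical, and this is not nginx's actual code: it only reproduces the documented layout, where the last hex character names the first-level directory and the two characters before it name the second level.

```python
import posixpath

def nginx_style_path(root: str, digest: str, levels=(1, 2)) -> str:
    """Build a cache file path from the trailing characters of a hex digest.

    Each entry in `levels` is a directory-name width; characters are
    consumed from the END of the digest, moving leftwards.
    """
    parts = []
    end = len(digest)
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    # The full digest is kept as the file name inside the nested directories.
    return posixpath.join(root, *parts, digest)

# Digest taken from the example in the nginx documentation.
print(nginx_style_path("/data/nginx/cache", "b7f54b2df7773722d382f4809d65029c"))
```

Note that the sketch only shows where the characters come from. It does not by itself explain why the end of an MD5 digest would distribute better than the start.
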