Section 1.3.2 of the book Mining of Massive Datasets gives an overview of hash functions. I don't have a computer science background, so this is quite new to me. Ruby was my first language, where a hash seems to be equivalent to Dictionary<object, object>, and I had never considered how this kind of data structure is put together.
The book mentions hash functions as a means of implementing these dictionary data structures, in this paragraph:
First, a hash function h takes a hash-key value as an argument and produces a bucket number as a result. The bucket number is an integer, normally in the range 0 to B − 1, where B is the number of buckets. Hash-keys can be of any type. There is an intuitive property of hash functions that they “randomize” hash-keys
What exactly are buckets in terms of a hash function? It sounds like buckets are array-like structures, and that the hash function is some kind of algorithm / array-like-structure search that produces the same bucket number every time? What is inside this metaphorical bucket?
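To make the question concrete, here is a minimal sketch of how I currently picture it from the book's paragraph. Everything in it is my own assumption rather than something the book gives: the bucket count B = 8, the toy hash function h (sum of the key's bytes modulo B), and the helper names put and get. I'm not claiming this is how Ruby or JavaScript actually implement it.

```python
B = 8  # number of buckets (arbitrary choice for this sketch)

def h(key: str) -> int:
    """Toy hash function: maps a string hash-key to a bucket number in 0..B-1."""
    return sum(key.encode("utf-8")) % B

# One list per bucket; each bucket holds the (key, value) pairs
# whose keys hash to that bucket number.
buckets = [[] for _ in range(B)]

def put(key: str, value) -> None:
    """Store value under key, replacing any existing entry for the same key."""
    bucket = buckets[h(key)]
    for i, (k, _) in enumerate(bucket):
        if k == key:
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def get(key: str):
    """Look only inside the one bucket that h(key) points to."""
    for k, v in buckets[h(key)]:
        if k == key:
            return v
    raise KeyError(key)

put("apple", 1)
put("pear", 2)
put("apple", 3)  # same key, so it lands in the same bucket and overwrites
```

In this picture, the same key always produces the same bucket number, and the "inside" of a bucket is just the small list of key/value pairs that happened to land there. Is that roughly right?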
I've always read that JavaScript objects, Ruby hashes, etc. don't guarantee order. In practice I've found that the order of keys doesn't change (although I think that with an older version of Mozilla's Rhino interpreter the JS object order DID change, but I can't be sure...).
Does that mean that hashes (Ruby) / objects (JS) ARE NOT resolved by these hash functions?
Does the word hashing take on different meanings depending on the level at which you are working with computers? That is, it would seem that a Ruby hash is not the same as a C++ hash...