2

I'm creating an std::unordered_map which I will immediately proceed to populate with n key-value pairs - and I know n. After that no more elements will be added - I will only be performing lookups.

What, therefore, should I pass as bucket_count to the constructor?

Notes:

Community
  • 1
  • 1
einpoklum
  • 118,144
  • 57
  • 340
  • 684
  • It very much depends on what you will do with this map afterwards. Will you add other elements, or just read it ? Do you seek speed or space efficiency ? How much collisions does you hash function yield on your set ? It is reasonable to think about the load factor rather than the actual number of buckets. – Jean-Bernard Jansen Dec 16 '16 at 11:13
  • @Jean-BernardJansen: See edit. Also, I'd like some sort of a reasonable default - the same way we now have a reasonable default without knowing n. Adding that information, and applying the same considerations, should yield some number... – einpoklum Dec 16 '16 at 11:21

2 Answers2

1

According to n4296 in 23.5.4.2 [unord.map.cnstr] (this is the final draft of C++14) by default the max_load_factor for an unordered_map is 1.0, so you could just set the bucket_count to n.

There is obviously a space-time trade-off between increasing the bucket count for improved speed and decreasing it (and raising the max load factor) for improved space.

I would either not worry about it, or if it is a large map, set the bucket count to n. Then you can worry about optimizing when profiling shows you have a problem.

If you know the range of load factors you want, then you just set the bucket count to std::ceil(n/(std::max(f_1,f_2)), (and set the load factor before you fill the map).

0

Given the fact you have a range for your load factor, the only missing information is the collision rate. You can simply use nb_buckets = n / f_2 and you will be sure to get a load factor less than or equal to f_2. Ensuring correctness about f_1 requires data about the collision rate.

Jean-Bernard Jansen
  • 7,544
  • 2
  • 20
  • 17