2

While defining the constructor for a HashSet

HashSet<Integer> hs = new HashSet<Integer>(10,(double)0.50);

The second argument called the "Fill Ratio" has a default value of 0.75.

I wanted to know whether there is a logical reason behind defaulting it to 0.75.

Chaipau
  • 199
  • 1
  • 5
  • 14
  • Read `hashset`'s doc . – passion Sep 22 '16 at 06:40
  • 1
    Hashtables suffer from _hash collisions_ and higher load factor (more common term than "fill ratio") means more collisions. This reduces both insertion and lookup performance. Any factor you choose will give you some tradeoff between space and time, and 0.75 is an empirically chosen value. It's good for the "separate chaining" hashtable design; the _open-addressed_ design is much more sensitive to collisions and needs a lower load factor (0.70 is the maximum useful value). – Marko Topolnik Sep 22 '16 at 07:12

2 Answers2

2

There is indeed logical reasoning behind the choice. If we understand HashSet is backed by a HashMap and recognize the constructor in your post calls a HashMap constructor:

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

and then proceed to related HashMap documentation we can see the logical reasoning behind the important choice.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

ChiefTwoPencils
  • 13,548
  • 8
  • 49
  • 75
1

A HashSet is backed by a HashMap so you can refer to HashMap's javadoc for the rationale behind the choice of 0.75 as a default value:

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).

assylias
  • 321,522
  • 82
  • 660
  • 783