I'm very confused by Scala's HashSet and Set types as they both seem to do the same thing.

  • What is the difference between them?
  • Is it the same in Java?
  • In my reference it says that HashSet is an "explicit set class" (as compared to Set). What does that mean?
Kristina
  • see also [another version of this question](http://stackoverflow.com/questions/18759913/what-is-the-difference-between-hashset-and-set-and-when-should-each-one), as well as a similar question for [`Map` vs `HashMap`](http://stackoverflow.com/questions/31685236/scala-map-vs-hashmap) – EthanP Mar 11 '16 at 06:50

1 Answer

Scala's mutable and immutable HashSet implementations are concrete classes which you can instantiate. For example, if you explicitly ask for a new scala.collection.immutable.HashSet, you will always get a set which is implemented by a hash trie. There are other set implementations, such as ListSet, which uses a list.
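To make that concrete, here is a minimal sketch that asks for specific implementations by name. The wrapper object `ConcreteSets` and the value names are only illustrative; `HashSet` and `ListSet` are the standard library classes from `scala.collection.immutable`.

```scala
import scala.collection.immutable.{HashSet, ListSet}

object ConcreteSets {
  // Explicitly request a hash-based immutable set.
  val hs: HashSet[Int] = HashSet(1, 2, 3)

  // Explicitly request a list-based immutable set.
  val ls: ListSet[Int] = ListSet(1, 2, 3)

  // Both are still Sets, so they can be used wherever a Set is expected.
  val asSets: List[Set[Int]] = List(hs, ls)

  // Set equality compares elements, not the implementing class.
  val sameElements: Boolean = hs == ls
}
```

The static types `HashSet[Int]` and `ListSet[Int]` guarantee which implementation you get, which is what "explicit" amounts to here.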

Set is a trait which all the set implementations extend (whereas in Java, Set is an interface).

Set is also a companion object* with an apply** method. When you call Set(...), you're calling this factory method and getting a return value which is some kind of Set. It might be a HashSet, but it could be some other implementation. According to [2], the default immutable set uses special representations for the empty set and for sets of up to 4 elements. Immutable sets of 5 or more elements, and all mutable sets, use HashSet.
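You can check which implementation the factory picked at runtime. This is a rough sketch, assuming a Scala 2.12/2.13 (or Scala 3) standard library; the wrapper object `FactoryDemo` and the value names are made up for illustration. The choice of class is an implementation detail, so treat the results as something that may differ between versions.

```scala
import scala.collection.{immutable, mutable}

object FactoryDemo {
  // All three values have the static type Set; the factory picks the class.
  val small: Set[Int] = Set(1, 2, 3)
  val large: Set[Int] = Set(1, 2, 3, 4, 5)
  val mut: mutable.Set[Int] = mutable.Set(1, 2, 3)

  // Runtime checks against the concrete HashSet classes.
  val smallIsHashSet: Boolean = small.isInstanceOf[immutable.HashSet[_]]
  val largeIsHashSet: Boolean = large.isInstanceOf[immutable.HashSet[_]]
  val mutIsHashSet: Boolean = mut.isInstanceOf[mutable.HashSet[_]]

  def main(args: Array[String]): Unit = {
    println(s"small is immutable.HashSet: $smallIsHashSet")
    println(s"large is immutable.HashSet: $largeIsHashSet")
    println(s"mut is mutable.HashSet: $mutIsHashSet")
  }
}
```

With the behaviour described above, I'd expect the small set not to be a HashSet while the other two are. Code that only relies on the Set trait does not need to care either way.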


*In Scala, instead of having static class methods, you can create a singleton object with the same name as your class or trait. This is called a companion object, and methods you define on it can be called as ObjectName.method(), similar to how you'd call a static method in Java.

**Set(x) is syntactic sugar for Set.apply(x).
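Both footnotes can be seen in a small sketch. `Celsius` is a hypothetical class invented for this example, not part of any library; the `Set` lines use the real standard library factory.

```scala
object CompanionDemo {
  // Hypothetical class with a private constructor:
  // only its companion object is allowed to call `new`.
  class Celsius private (val degrees: Double)

  // Companion object: same name, same scope. Its members play the role
  // that static methods would play in Java.
  object Celsius {
    def apply(degrees: Double): Celsius = new Celsius(degrees)
  }

  // Celsius(21.5) is sugar for Celsius.apply(21.5).
  val a: Celsius = Celsius(21.5)
  val b: Celsius = Celsius.apply(21.5)

  // The same sugar is what makes Set(...) work.
  val viaSugar: Set[Int] = Set(1, 2)
  val viaApply: Set[Int] = Set.apply(1, 2)
}
```

The name `Set` therefore refers to two things at once: the trait (a type) and its companion object (a factory value).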

Zhihao
Ben James
  • `It might be a HashSet, but could be some other implementation.` - but which factors determine the type of Set that is returned? I previously thought the default implementation for `Set` is `HashSet`, for `IndexedSeq` it is `Vector`, and so on. – WelcomeTo Aug 07 '13 at 19:47