8

I'm looking for a Java class with the characteristics of C++ std::map's usual implementation (as I understand it, a self-balancing binary search tree):

  1. O(log n) performance for insertion/removal/search
  2. Each element is composed of a unique key and a mapped value
  3. Keys follow a strict weak ordering

I'm looking for implementations with open source or design documents; I'll probably end up rolling my own support for primitive keys/values.

This question's style is similar to: Java equivalent of std::deque, whose answer was "ArrayDeque from Primitive Collections for Java".

Community
Rudiger
  • Why do you need a tree that holds primitives? Why not just use a TreeMap (which is a red-black tree)? Even if you find a tree that can hold primitives, there would still be some sort of Node class, i.e. it wouldn't be just primitives you're instantiating. – Bart Kiers Feb 13 '10 at 18:26
  • Do you need primitive _keys_ or primitive _values_? If values, I agree with Bart: just use `TreeMap` (or CSLM or whatever) and live with the boxing (since you already have to box into nodes to get it into the tree in the first place). If you just want primitive keys, the Scala library has `scala.collection.immutable.IntMap` and similar for `LongMap`, which take `Int` (i.e. java `int`) or `Long` (i.e. java `long`) as primitive keys. You can use the Scala library without too much effort from Java--but why even use that much effort if the standard library is adequate? – Rex Kerr Feb 13 '10 at 18:51
  • `@Bart K.:` I'm using primitive keys and values, and for most applications, collections which store primitives directly will require less space and yield significant performance gains. And to answer your second point, a Node class is not required; popular data structures generally have array implementations. – Rudiger Feb 13 '10 at 18:52
  • Take a look at the latest edit of this question; primitive keys/values isn't really a requirement for this question, which should open up the possibility for more answers. – Rudiger Feb 13 '10 at 18:58
  • @Rudiger, an array is also an Object in Java. But I wouldn't worry about performance that much at first. At least not before you have profiled your application and confirmed that much of the time is spent adding (or removing) stuff in your tree. – Bart Kiers Feb 13 '10 at 19:10
  • `@Bart K.:` Yes, an array is an object in Java. What's your point? An array-based implementation wouldn't use an array for _every_ key/value pair. However, you're correct about profiling before optimizing. My application maintains two of these maps inside each of ~10,000 objects, and does billions of add/update/remove/search operations altogether. It basically spends _all_ of its time reading a file and modifying these maps. – Rudiger Feb 13 '10 at 19:24
  • And you know up front that a TreeMap does not scale? (after reading your comments, I probably misunderstood your array remark btw) – Bart Kiers Feb 13 '10 at 20:43
  • What is wrong with Alex Miller's answer? – Michael Myers Mar 10 '10 at 21:32
  • @Rudiger Try reading your file in one thread and modify your map in another. I doubt that the map will be the bottleneck. And while it's not strictly related, here is a simple attempt to put some light on TreeMap's performance: http://stackoverflow.com/questions/2430962/data-structure-behind-amazon-s3s-keys-filtering-data-structure/2431634#2431634 – sfussenegger Mar 12 '10 at 23:55

3 Answers

8

ConcurrentSkipListMap is a sorted map backed by a skip list (a probabilistically balanced, tree-like structure with expected O(log n) performance for insertion, removal and search). TreeMap, a red-black tree implementation, guarantees O(log n) in the worst case, so which one is faster in practice depends on the workload; CSLM's clear advantage is that it is thread-safe and concurrent, which TreeMap is not. CSLM was added in Java 6 (JDK 1.6).
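Both classes implement java.util.NavigableMap, so code written against that interface can switch between them. A minimal sketch, assuming Integer keys and String values; the class and method names (SortedMapDemo, fillAndProbe) are made up for illustration:

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

class SortedMapDemo {
    // Hypothetical helper: fills any NavigableMap, then returns the
    // smallest key that is >= probe (or null if there is none).
    static Integer fillAndProbe(NavigableMap<Integer, String> map, int probe) {
        map.put(30, "thirty");
        map.put(10, "ten");
        map.put(20, "twenty");
        map.put(20, "TWENTY"); // keys are unique: the old value is replaced
        map.remove(10);
        return map.ceilingKey(probe);
    }

    public static void main(String[] args) {
        // Same calls, two different backing structures.
        System.out.println(fillAndProbe(new TreeMap<>(), 15));
        System.out.println(fillAndProbe(new ConcurrentSkipListMap<>(), 15));
    }
}
```

The ordered queries (ceilingKey, floorKey, headMap, tailMap and so on) are roughly what you would do with lower_bound/upper_bound on a std::map.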

Trove has a set of collections for primitive types and some other interesting variants of the common Java collection types.

Other collection libraries of interest include the Google Collections library (since folded into Guava) and Apache Commons Collections.

Mizux
Alex Miller
  • TreeMap (i.e. a red-black tree) actually is the answer (although I rewrote a version of the red-black tree to use primitives and arrays)... Sorry guys; I was kinda drunk when I asked this and didn't realize the answer was right under my nose. – Rudiger Mar 13 '10 at 22:53
5

The closest class to a binary tree in the standard Java libraries is java.util.TreeMap, but it doesn't support primitive types except by boxing (e.g. an int is wrapped as an Integer, a double as a Double, etc.).
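As a rough illustration of what that looks like in practice, here is a small sketch of a TreeMap with boxed keys and values, with an optional Comparator playing the role of std::map's comparison parameter. The class and method names (BoxedTreeMapDemo, build) are hypothetical:

```java
import java.util.Comparator;
import java.util.TreeMap;

class BoxedTreeMapDemo {
    // Builds a small map; passing a null comparator makes TreeMap
    // fall back to the keys' natural ordering.
    static TreeMap<Integer, Double> build(Comparator<Integer> order) {
        TreeMap<Integer, Double> map = new TreeMap<>(order);
        map.put(3, 0.3); // the int and double literals are autoboxed
        map.put(1, 0.1);
        map.put(2, 0.2);
        return map;
    }

    public static void main(String[] args) {
        int smallest = build(null).firstKey();                     // auto-unboxed
        int largest = build(Comparator.reverseOrder()).firstKey(); // reversed order
        System.out.println(smallest + " " + largest);
    }
}
```

Every put here allocates (or reuses a cached) Integer and a Double in addition to the tree node itself, which is the overhead being discussed.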

java.util.HashMap is likely to give better performance for large maps, provided you don't need the keys kept in sorted order. Its operations are O(1) on average, but the precise performance characteristics depend on the hash code generation algorithm(s) for the key class(es).

According to Introduction to Collections: "Arrays ... are the only collection that supports storing primitive data types."

richj
2

You can also take a look at FastTreeMap from Apache Commons Collections.

I doubt you will find many collections that support primitive types without boxing, so just live with it. Thanks to autoboxing, you rarely need to box or unbox by hand anyway.

If you really want to use primitives (after benchmarks show that performance is insufficient!), you can take the source of FastTreeMap and add methods for handling primitives.
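To give a feel for what primitive storage can look like, here is a deliberately simple sketch that is not based on FastTreeMap and is not a balanced tree: an int-to-int map kept in two sorted parallel arrays. Lookups are O(log n) via binary search, but insertion and removal are O(n) because elements are shifted, so it only suits small or read-heavy maps. The class name IntIntSortedMap and its methods are made up for illustration.

```java
import java.util.Arrays;

class IntIntSortedMap {
    private int[] keys = new int[8];   // kept sorted in [0, size)
    private int[] values = new int[8]; // values[i] belongs to keys[i]
    private int size;

    // Inserts the pair, or overwrites the value if the key already exists.
    void put(int key, int value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = value;
            return;
        }
        int at = -(i + 1); // insertion point reported by binarySearch
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail one slot right to make room (the O(n) part).
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    // Returns the mapped value, or fallback if the key is absent.
    int getOrDefault(int key, int fallback) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 ? values[i] : fallback;
    }

    // Removes the key if present; returns whether anything was removed.
    boolean remove(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i < 0) {
            return false;
        }
        System.arraycopy(keys, i + 1, keys, i, size - i - 1);
        System.arraycopy(values, i + 1, values, i, size - i - 1);
        size--;
        return true;
    }

    int size() {
        return size;
    }
}
```

No per-entry objects are allocated here, which is the space saving primitive collections are after; a real replacement for TreeMap would need a red-black tree laid out over arrays (or a library such as Trove) to keep updates at O(log n).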

Bozho