8

In TreeSet there is a method called contains that returns true if an element is in the set. I assume that this method uses binary search and does not iterate through all the elements in ascending order. Am I right?

I have a TreeSet that contains objects of a class that uses two String instance variables to distinguish it from other objects of the same class. I want to write a method that searches the TreeSet by comparing each object's two instance variables (through their get methods, of course) with two other String variables and returns the element if they are equal. If the instance variables are less than the search values, the search should continue in the right subtree; if they are greater, in the left subtree, and so on. Is there a way to do this?

I know I could just store the objects in an ArrayList and use binary search to find the object, but this wouldn't be as fast as just searching the TreeSet.

exent
  • How do you know binary search in an `ArrayList` isn't as fast? Have you tried? – Fred Foo Apr 05 '11 at 21:31
  • I mean passing the elements from the TreeSet to a new ArrayList every time I need to search for an element and return it is slow. – exent Apr 05 '11 at 21:40
  • ah, yes, that would definitely be very slow. But if you first build the set, then search it multiple times, then sorting and binary searching an `ArrayList` might turn out to be quite fast. – Fred Foo Apr 05 '11 at 21:50
  • What if one instance variable is greater than its counterpart and the other is less than its counterpart? How should that sort? – Carl Manaster Apr 06 '11 at 02:54
  • @Carl Manaster: The Objects are sorted first by one of the Strings and then the other one. Just like a list of names would be sorted first on the last name and then the first name. – exent Apr 06 '11 at 07:05

5 Answers

15
set.tailSet(obj).first();

does what you want.
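One caveat worth spelling out: `tailSet(obj).first()` returns the least element that is greater than or equal to `obj`, so the result still has to be checked for equality, and `first()` throws `NoSuchElementException` when the tail set is empty. A minimal sketch of that check follows, using `ceiling`, which returns `null` instead of throwing. The `Person` class, its fields, `BY_NAME`, and `find` are hypothetical names chosen for illustration, not taken from the question.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class TailSetLookup {
    // Hypothetical element type: the two Strings define the ordering,
    // and age stands in for whatever other data the object carries.
    static final class Person {
        final String lastName;
        final String firstName;
        final int age;

        Person(String lastName, String firstName, int age) {
            this.lastName = lastName;
            this.firstName = firstName;
            this.age = age;
        }
    }

    // Sort by last name first, then by first name.
    static final Comparator<Person> BY_NAME =
            Comparator.comparing((Person p) -> p.lastName)
                      .thenComparing(p -> p.firstName);

    // Returns the stored element whose two Strings match, or null if there is none.
    static Person find(TreeSet<Person> set, String lastName, String firstName) {
        Person probe = new Person(lastName, firstName, 0);
        // ceiling() gives the least element >= probe, or null when no such element exists.
        Person candidate = set.ceiling(probe);
        if (candidate != null && BY_NAME.compare(candidate, probe) == 0) {
            return candidate;
        }
        return null;
    }

    public static void main(String[] args) {
        TreeSet<Person> set = new TreeSet<>(BY_NAME);
        set.add(new Person("Smith", "Anna", 30));
        set.add(new Person("Jones", "Bob", 41));

        Person hit = find(set, "Smith", "Anna");
        System.out.println(hit == null ? "not found" : "found, age " + hit.age);
        System.out.println(find(set, "Kent", "Clark") == null ? "not found" : "found");
    }
}
```

The probe object only needs the two Strings filled in, because the comparator never looks at the remaining fields.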

Tomasz Nurkiewicz
greg
6

Rather than using a TreeSet, you could store your objects in a TreeMap<Foo, Foo> or TreeMap<FooKey, Foo> (if you can't easily create a new actual Foo each time you want to search). Sets are not really intended for lookup.

For the FooKey example, FooKey would be a simple immutable class that just contains the two Strings and is Comparable. Finding the value of Foo for two Strings would then be a simple matter of treeMap.get(new FooKey(firstString, secondString)). This does of course use the tree traversal you want to find the value.
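As a rough sketch of the `TreeMap<FooKey, Foo>` idea described above: the field names, the `payload` field, and the `find` helper are assumptions made for illustration, and a production `FooKey` would normally also override `equals` and `hashCode` so that it stays consistent with `compareTo`.

```java
import java.util.TreeMap;

public class FooKeyDemo {
    // Simple immutable key holding the two Strings; ordered by first, then second.
    static final class FooKey implements Comparable<FooKey> {
        private final String first;
        private final String second;

        FooKey(String first, String second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public int compareTo(FooKey other) {
            int c = first.compareTo(other.first);
            return c != 0 ? c : second.compareTo(other.second);
        }
    }

    // Hypothetical value type; payload stands in for the rest of the object's state.
    static final class Foo {
        final String first;
        final String second;
        final String payload;

        Foo(String first, String second, String payload) {
            this.first = first;
            this.second = second;
            this.payload = payload;
        }

        FooKey key() {
            return new FooKey(first, second);
        }
    }

    // O(log n) lookup; returns null when no entry has this pair of Strings.
    static Foo find(TreeMap<FooKey, Foo> map, String first, String second) {
        return map.get(new FooKey(first, second));
    }

    public static void main(String[] args) {
        TreeMap<FooKey, Foo> map = new TreeMap<>();
        Foo a = new Foo("Smith", "Anna", "record A");
        Foo b = new Foo("Jones", "Bob", "record B");
        map.put(a.key(), a);
        map.put(b.key(), b);

        Foo hit = find(map, "Jones", "Bob");
        System.out.println(hit == null ? "not found" : hit.payload);
    }
}
```

Iterating `map.values()` still visits the objects in sorted key order, so the ordering the set provided is not lost.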

ColinD
2

You should either implement Comparable on your object or create a separate Comparator class that you pass in when the TreeSet is constructed. Either way, you supply your own comparison logic and let the TreeSet do its optimized store/search thing.
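A minimal sketch of the `Comparable` route, assuming a hypothetical `Foo` class with two String fields named `first` and `second`. Once `compareTo` is defined, `contains` descends the tree using that ordering rather than visiting every element.

```java
import java.util.TreeSet;

public class ComparableFooDemo {
    // Hypothetical element type; the two Strings define the natural ordering.
    static final class Foo implements Comparable<Foo> {
        final String first;
        final String second;

        Foo(String first, String second) {
            this.first = first;
            this.second = second;
        }

        // Order by the first String, then break ties on the second.
        @Override
        public int compareTo(Foo other) {
            int c = first.compareTo(other.first);
            return c != 0 ? c : second.compareTo(other.second);
        }
    }

    // Builds a small sample set that relies on Foo's natural ordering.
    static TreeSet<Foo> sampleSet() {
        TreeSet<Foo> set = new TreeSet<>();
        set.add(new Foo("Smith", "Anna"));
        set.add(new Foo("Jones", "Bob"));
        set.add(new Foo("Smith", "Aaron"));
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Foo> set = sampleSet();
        // TreeSet decides membership with compareTo, not equals.
        System.out.println(set.contains(new Foo("Smith", "Anna")));
        System.out.println(set.contains(new Foo("Kent", "Clark")));
    }
}
```

Passing a `Comparator<Foo>` to the `TreeSet` constructor works the same way and is the better fit when the class cannot be modified or needs more than one ordering.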

Konstantin Komissarchik
1

One thing I was wondering is why you want to search in a sorted set. If you want to iterate in order as well as look up quickly, you may benefit from storing your objects in two separate data structures: one like your SortedSet<Foo>, and also a HashMap<FooKey,Foo> similar to what ColinD mentioned. Then you get constant-time lookups (on average) instead of log(n) on the TreeMap. You pay a write penalty for updating both structures and a memory penalty for holding both, but each kind of access uses the structure best suited to it.

Also, if memory resources are constrained and your strings are really what differentiate the objects, then you can just implement hashCode() and equals() on your object Foo and use the objects as both the key and the value (like HashMap<Foo,Foo>). The caveat there is that you have to construct a Foo to call get.
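The `HashMap<Foo,Foo>` variant might look roughly like this. The `Foo` fields, the `payload` field, and the `find` helper are hypothetical; the only requirement is that `equals` and `hashCode` depend on the two Strings and nothing else.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class HashLookupDemo {
    // Hypothetical element type; identity is defined by the two Strings only.
    static final class Foo {
        final String first;
        final String second;
        final String payload; // not part of equals/hashCode

        Foo(String first, String second, String payload) {
            this.first = first;
            this.second = second;
            this.payload = payload;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Foo)) return false;
            Foo other = (Foo) o;
            return first.equals(other.first) && second.equals(other.second);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    // Builds a throwaway probe Foo to look up the stored instance.
    static Foo find(Map<Foo, Foo> index, String first, String second) {
        return index.get(new Foo(first, second, null));
    }

    public static void main(String[] args) {
        Map<Foo, Foo> index = new HashMap<>();
        Foo stored = new Foo("Smith", "Anna", "record A");
        index.put(stored, stored); // same object as key and value

        Foo hit = find(index, "Smith", "Anna");
        System.out.println(hit == null ? "not found" : hit.payload);
    }
}
```

The probe carries a `null` payload, which is harmless here because the payload takes no part in equality or hashing.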

Justin Waugh
  • Maintaining two structures containing the same data is something I'd avoid if possible, not so much because of memory or performance costs of writing to both but because of the necessity of ensuring in your code that both structures are kept in sync at all times. That said, using a `HashMap` for lookups is certainly preferable. The OP didn't say why they needed a sorted structure, so perhaps they actually don't and just thought that would be the most efficient. – ColinD Apr 06 '11 at 16:44
0

You got the answer about using Comparable/Comparator, but I thought I would add that you are right that contains() does a binary search, though you shouldn't need to know those details.

Justin Waugh
  • I need to know these details because I want to choose the most effective method possible. – exent Apr 05 '11 at 21:43
  • You generally should not need to know the details of how a particular implementation of an interface in the JDK is implemented. You cannot guarantee that it will always be that way. That is why it is behind an interface. – Justin Waugh Apr 05 '11 at 22:06
  • But in this case I didn't know if binary search was used or sequential search. That is a big difference. – exent Apr 05 '11 at 22:11
  • @exent: You should look at the performance guarantees that the JDK collections make. For example, `TreeMap` states "This implementation provides guaranteed log(n) time cost for the `containsKey`, `get`, `put` and `remove` operations". It's the log(n) time you want, not binary search (binary search specifically applies to sorted index-based structures like arrays and Lists, not trees). – ColinD Apr 05 '11 at 22:22
  • @ColinD I thought it was also called binary search for the tree search method I described. – exent Apr 05 '11 at 22:39
  • @exent: The tree itself is a kind of binary search tree. I guess you could refer to searching in it as "binary search", but it seems slightly weird to me since when you search you're just traversing the already built structure of the tree in a very straightforward manner. – ColinD Apr 05 '11 at 23:21