8

In my code, Java TreeSet iteration is the dominant time factor. In looking at the system I believe that it is O(n) complexity. Can anyone verify this?

I am thinking that by providing links backward from child node to parent node I could improve the performance.

John Smith
  • 81
  • 1
  • 3
  • This question doesn't make sense. It sounds like you're saying iterating over your treeset is O(n). This is the best you can do for iteration - looking at n items requires O(n) time. If you want to make the code which is dominated by the iteration faster, you need to change the algorithm so it doesn't do iteration - for example, by doing lookups in the tree by key (which would be O(log n)) instead. – babbageclunk Aug 12 '10 at 10:28

2 Answers2

5

TreeSet iteration is of course O(n), as can be expect from any sensible tree-walking algorithm.

I am thinking that by providing links backward from child node to parent node I could improve the performance.

TreeMap (which TreeSet is based on) already has such parent references. This is the method it all boils down to:

private Entry<K,V> successor(Entry<K,V> t) {
    if (t == null)
        return null;
    else if (t.right != null) {
        Entry<K,V> p = t.right;
        while (p.left != null)
            p = p.left;
        return p;
    } else {
        Entry<K,V> p = t.parent;
        Entry<K,V> ch = t;
        while (p != null && ch == p.right) {
            ch = p;
            p = p.parent;
        }
        return p;
    }
}
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • I think the key point is that it is not only the leaf nodes that have values attached. For `n` leafs you do have `O(n log n)` nodes but also `O(n log n)` values. In real cases, I would expect `O(n)` operations to dominate `O(n log n)` operations anyway. – Tom Hawtin - tackline May 03 '10 at 16:30
  • I thought it would be O(n)+O(logn) because if you are iterating in order you have to first use log(n) traversals to find the left or rightmost leaf, and then you have n traversals to walk the tree backwards. Thus you have O(n) ? Does my math make sense? – John Smith May 03 '10 at 16:37
  • 2
    @john: O(n)+O(log n) is the same thing as O(n) per definition. – Michael Borgwardt May 03 '10 at 18:18
  • 2
    @Tom: O(n*log n) is strictly greater than O(n), but it's not true that you have O(n*log n) nodes for n leafs. For a binary tree with n leafs, you have n/2 nodes at the second-to-lowest level, n/4 at the next higher one... That's a geometric series and adds up to 2n, i.e. O(n). – Michael Borgwardt May 03 '10 at 18:23
  • Second part is true (but it's mistake which could clearly have been made!). Although O(n log n) is more stuff than O(n), for actual values of n, doing O(n log n) and O(n) may take longer in the latter. – Tom Hawtin - tackline May 03 '10 at 20:45
1

Have you considered taking a copy of the TreeSet when you alter it? If the dominate time is spent in TreeSet iteration (rather than modifying it) then copying the TreeSet to an array or ArrayList (only when altered) and only iterating over this array/ArrayList could almost elminate the cost of TreeSet iteration.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130