I've got a unidirectional tree of objects, in which each object points to its parent. Given an object, I need to obtain its entire subtree of descendants, as a collection of objects. The objects are not actually held in any data structure, but I can easily get a collection of all the objects.

The naive approach is to examine each object in the batch, check whether the given object is one of its ancestors, and if so set it aside. This would not be too efficient... each object may have to walk its whole chain of ancestors, so the worst case is O(N*N), where N is the number of objects.
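For reference, the naive approach could be sketched roughly as below. The `Node` type, its `parent` field, and the class name `NaiveDescendants` are assumptions made for illustration only; the real objects presumably expose the parent link differently.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveDescendants {

    // Hypothetical node type: the only link is the reference to the parent.
    static final class Node {
        final String name;
        final Node parent; // null for the root

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // For every object, walk up its parent chain looking for the target.
    // Each walk can be as long as the tree is tall, hence O(N*N) worst case.
    static List<Node> findDescendants(List<Node> all, Node target) {
        List<Node> result = new ArrayList<>();
        for (Node n : all) {
            for (Node a = n.parent; a != null; a = a.parent) {
                if (a == target) {
                    result.add(n);
                    break;
                }
            }
        }
        return result;
    }
}
```

On a degenerate tree (a single chain of N objects) every walk visits up to N ancestors, which is where the quadratic bound comes from.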

Another approach is the recursive one: search for the object's direct children, then repeat the process for the next level. Unfortunately the tree is unidirectional... there's no direct access to the children, so every level means another scan over all the objects, and this would be only slightly less costly than the previous approach.

My question: Is there an efficient algorithm I'm overlooking here?

Thanks,

Yuval =8-)

Yuval
  • Should the algorithm return a list of the descendant Nodes, or does it need to be ordered in a tree as well? I don't see how you can do the latter if Nodes only point to their ancestors (i.e. you couldn't return a root node). – matt b Oct 16 '08 at 16:18
  • That sounds more like a collection of interlinked lists than a tree – workmad3 Oct 16 '08 at 16:20
  • matt: Of course, this algorithm results in a collection... I edited the algorithm description accordingly. =8-) – Yuval Oct 16 '08 at 16:31

4 Answers

Databases work the same way, so do what databases do. Build up a hashtable which maps from each parent to a list of its children. That takes O(n). After that, finding the children of any object is a single hashtable lookup instead of a scan over all the objects.
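A minimal sketch of building such a parent-to-children index in one pass, assuming Java 8 streams are available. The `Node` type, the class name `ChildIndex`, and the helpers `build` and `childrenOf` are hypothetical names chosen for this example, not part of the question.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ChildIndex {

    // Hypothetical node type: the only link is the reference to the parent.
    static final class Node {
        final String name;
        final Node parent; // null for the root

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // One O(n) pass: group every non-root node under its parent.
    static Map<Node, List<Node>> build(List<Node> all) {
        return all.stream()
                .filter(n -> n.parent != null) // groupingBy rejects null keys
                .collect(Collectors.groupingBy(n -> n.parent));
    }

    // Leaves have no entry in the map, so fall back to an empty list.
    static List<Node> childrenOf(Map<Node, List<Node>> index, Node node) {
        return index.getOrDefault(node, Collections.emptyList());
    }
}
```

Once the index exists it can be reused for any number of subtree queries, so the O(n) build cost is paid only once as long as the tree does not change.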

yfeldblum

As others have mentioned, build a hashtable/map of objects to a list of their (direct) children.

From there you can easily look up the list of direct children of your "target object", and then repeat the process for each object in that list.

Here's how I did it in Java, using generics and a queue instead of recursion:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Node is assumed to expose getParent(), returning null for the root
public static Set<Node> findDescendants(List<Node> allNodes, Node thisNode) {

    // keep a map of Nodes to a List of that Node's direct children
    Map<Node, List<Node>> map = new HashMap<Node, List<Node>>();

    // populate the map - this is O(n) since we examine each and every node
    // in the list
    for (Node n : allNodes) {

        Node parent = n.getParent();
        if (parent != null) {

            List<Node> children = map.get(parent);
            if (children == null) {
                // first child seen for this parent: instantiate its list
                children = new ArrayList<Node>();
                map.put(parent, children);
            }
            children.add(n);
        }
    }


    // now, create a collection of thisNode's descendants (of all levels)
    Set<Node> allChildren = new HashSet<Node>();

    // keep a queue of nodes to look at; ArrayDeque removes from the head
    // in O(1), whereas ArrayList.remove(0) shifts every remaining element
    Deque<Node> nodesToExamine = new ArrayDeque<Node>();
    nodesToExamine.add(thisNode);

    while (!nodesToExamine.isEmpty()) {
        // take the next node off the head of the queue
        Node node = nodesToExamine.remove();

        List<Node> children = map.get(node);
        if (children != null) {
            for (Node c : children) {
                allChildren.add(c);
                nodesToExamine.add(c);
            }
        }
    }

    return allChildren;
}

The expected execution time is O(n), assuming constant-time hash lookups and queue operations (O(n) and O(2n) are the same class, since constant factors drop out). You're guaranteed to look at every node in the list once to build the map, plus one more visit for each descendant of your node - in the worst case (if you run the algorithm on the root node) you are looking at every node in the list twice.

matt b

Your question is a little abstract, but nested sets (scroll down, might be a little too MySQL-specific) might be an option for you. They are extremely fast for read operations, though any modification is quite complex (and has to update half the tree on average).

That requires the ability to modify your data structure, though. And I guess if you can modify the structure, you could just as well add references to child objects. If you can't modify the structure, I doubt there's anything faster than your ideas.
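To illustrate the nested-set idea in memory rather than in SQL, here is a rough sketch. It assumes every node already carries two numbers, here called `left` and `right`, assigned by a depth-first numbering so that a node's interval encloses the intervals of all its descendants. The `Node` type, the field names, and the class name `NestedSetDemo` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedSetDemo {

    // Hypothetical node carrying the two nested-set bounds.
    static final class Node {
        final String name;
        final int left;
        final int right;

        Node(String name, int left, int right) {
            this.name = name;
            this.left = left;
            this.right = right;
        }
    }

    // A node is a descendant of target exactly when its interval lies
    // strictly inside target's interval. No parent links are followed.
    static List<Node> findDescendants(List<Node> all, Node target) {
        List<Node> result = new ArrayList<>();
        for (Node n : all) {
            if (n.left > target.left && n.right < target.right) {
                result.add(n);
            }
        }
        return result;
    }
}
```

For example, a root numbered (1, 8) with children a (2, 5) and c (6, 7), where a has a child b (3, 4), yields only b as a descendant of a. This in-memory version still scans every node once per query; a database would typically answer the same range condition through an index on the left bound.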

Matthias Winkelmann

Building a tree where the objects point to their immediate children would probably be the best approach, especially if you need to do future look-ups. The cost of building that tree depends largely on the height of the original tree. In the worst case, it would take O(n^2).

While you're building the tree, build a hashtable. The hashtable will make future searches for a particular object faster (O(1) vs. O(n)).

Ray Li