I'm trying to write a function that returns all the leaves of my recursive tree. I have seen many other posts about this, but I couldn't adapt them to my own code. I am trying to build something like a decision tree. This is my code:

class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child
        self.children_list = []

class Decision:
    def __init__(self, root: Node):
        self.root = root
        self.current = root

    def collect_leaves(self, node, leafs):
        if node is not None:
            if len(node.children_list) == 0:
                leafs.append(node.data)
            for n in node.children_list:
                self.collect_leaves(n, leafs)

    def return_all_leaves(self):
        leafs = []
        self.collect_leaves(self.root, leafs)
        return leafs

For some reason it returns only the root, and not the leaves.

For example:

flu_leaf2 = Node("influenza", None, None)
cold_leaf2 = Node("cold", None, None)
hard_leaf2 = Node("hard influenza", None, None)
headache_node2 = Node("headache", hard_leaf2, flu_leaf2)
inner_vertex2 = Node("fever", headache_node2, cold_leaf2)
healthy_leaf2 = Node("healthy", None, None)
root2 = Node("cough", inner_vertex2, healthy_leaf2)
diagnoser2 = Diagnoser(root2)

diagnoser2.return_all_leaves() is supposed to return:

['hard influenza', 'influenza', 'cold', 'healthy']
David Buck

1 Answer

I'm not sure you need children_list. Nothing in your code ever adds to it, so every node (including the root) looks like a leaf, and it is just an extra thing to maintain. It should be enough to check whether the node has a positive_child or a negative_child.
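To make the cause concrete, here is a minimal sketch, assuming the Node class exactly as posted in the question (collect_leaves is written as a plain function here only to keep the example short). The constructor stores the two children but leaves children_list empty, so the length check is already true for the root and the loop has nothing to visit.

```python
class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child
        self.children_list = []  # never filled in by the constructor


def collect_leaves(node, leafs):
    # Same logic as the question's method, written as a plain function.
    if node is not None:
        if len(node.children_list) == 0:
            leafs.append(node.data)
        for n in node.children_list:
            collect_leaves(n, leafs)


root = Node("cough", Node("fever"), Node("healthy"))

found = []
collect_leaves(root, found)
print(found)               # ['cough']
print(root.children_list)  # []
```

Either populating children_list in the constructor or, as suggested here, dropping it and testing the two child attributes directly would avoid this.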

Note the addition of the __str__ method to Node to show something nice when printing...

Take a look at:

class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child

    def __str__(self) -> str:
        return self.data

class Decision:
    def __init__(self, root: Node):
        self.root = root

    def collect_leaves(self, node):
        ## ---------------------------
        ## This node has no children. It is a leaf
        ## ---------------------------
        if not node.positive_child and not node.negative_child:
            return [node]
        ## ---------------------------

        ## ---------------------------
        ## Recursively collect the leaves of children
        ## ---------------------------
        leaves = []
        if node.positive_child:
            leaves.extend(self.collect_leaves(node.positive_child))
        if node.negative_child:
            leaves.extend(self.collect_leaves(node.negative_child))
        return leaves
        ## ---------------------------

    def return_all_leaves(self):
        return self.collect_leaves(self.root)

my_decsion = Decision(
    Node(
        "root",
        Node("root_a", None, None),
        Node(
            "root_b",
            Node("root_b_1", None),
            Node("root_b_2", None),
        ),
    )
)

for node in my_decsion.return_all_leaves():
    print(node)

That should give you:

root_a
root_b_1
root_b_2

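Note, though, that recursion in Python is limited: the default recursion limit is 1000 frames (see sys.getrecursionlimit()), so a very deep tree raises RecursionError. You might want to look at an implementation that is not based on recursion.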
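As a rough illustration of that limit, the sketch below builds a degenerate tree that is just one long chain of positive_child links and runs a recursive leaf collector on it. The names deep_chain and hit_limit are made up for this example, collect_leaves is a standalone copy of the method above, and the depth is chosen relative to whatever sys.getrecursionlimit() reports on your interpreter.

```python
import sys


class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child


def collect_leaves(node):
    # Recursive collector: one Python stack frame per tree level.
    if not node.positive_child and not node.negative_child:
        return [node]
    leaves = []
    if node.positive_child:
        leaves.extend(collect_leaves(node.positive_child))
    if node.negative_child:
        leaves.extend(collect_leaves(node.negative_child))
    return leaves


def deep_chain(depth):
    # Build the chain bottom-up with a loop, so construction itself
    # does not recurse.
    node = Node("leaf")
    for i in range(depth):
        node = Node(f"level_{i}", node, None)
    return node


# Deliberately deeper than the interpreter's current limit.
root = deep_chain(sys.getrecursionlimit() + 100)

try:
    collect_leaves(root)
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)
```

A realistic decision tree is unlikely to be anywhere near this deep, so the recursive version is usually fine in practice.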

If you wanted a version of return_all_leaves() that was not based on recursion, you might try:

    def return_all_leaves2(self):
        leaves = []
        todo = [self.root]
        while todo:
            this_node = todo.pop(0)

            ## ---------------------------
            ## This node was None... I think this is cleaner than testing
            ## parent_node.positive_child and parent_node.negative_child
            ## ---------------------------
            if not this_node:
                continue
            ## ---------------------------

            ## ---------------------------
            ## This node has no children. It is a leaf
            ## ---------------------------
            if not this_node.positive_child and not this_node.negative_child:
                leaves.append(this_node)
                continue
            ## ---------------------------

            ## ---------------------------
            ## add the leaves of children to future work
            ## ---------------------------
            todo.append(this_node.positive_child)
            todo.append(this_node.negative_child)
            ## ---------------------------

        return leaves
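One thing to be aware of: because pop(0) takes nodes from the front of the list, the loop above visits the tree level by level, so the leaves can come back in a different order than in the recursive version (and pop(0) on a list costs O(n) per call). If the order matters, a sketch with an explicit stack might look like the following. leaves_in_order is a hypothetical standalone helper, not part of the classes above; it pushes the negative child first so that the positive child is popped, and therefore visited, first.

```python
class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child

    def __str__(self) -> str:
        return self.data


def leaves_in_order(root):
    # Hypothetical helper: iterative depth-first traversal.
    leaves = []
    stack = [root]
    while stack:
        node = stack.pop()  # take from the end: depth-first
        if node is None:
            continue
        if node.positive_child is None and node.negative_child is None:
            leaves.append(node)
            continue
        # Push negative first so that positive is processed first.
        stack.append(node.negative_child)
        stack.append(node.positive_child)
    return leaves


tree = Node(
    "cough",
    Node(
        "fever",
        Node("headache", Node("hard influenza"), Node("influenza")),
        Node("cold"),
    ),
    Node("healthy"),
)

names = [str(leaf) for leaf in leaves_in_order(tree)]
print(names)  # ['hard influenza', 'influenza', 'cold', 'healthy']
```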

If we test with your data:

my_decsion2 = Decision(
    Node(
        "cough",
        Node(
            "fever",
            Node(
                "headache",
                Node("hard influenza", None, None),
                Node("influenza", None, None)
            ),
            Node("cold", None, None)
        ),
        Node("healthy", None, None)
    )
)

for node in my_decsion2.return_all_leaves():
    print(node)

This code prints:

hard influenza
influenza
cold
healthy
JonSG
  • First of all, thank you very much :) It returned only their locations, I mean this: [<__main__.Node object at 0x000001B63BF63A60>, <__main__.Node object at 0x000001B63BF63B20>, <__main__.Node object at 0x000001B63BF63AC0>, <__main__.Node object at 0x000001B63BF63940>] But the length is right: it is supposed to return 4 items. However, when I try to write node.positive_child.data, it gives me an error ('str' object has no attribute 'positive_child') – Zvia Green Dec 25 '21 at 22:16
  • Did you see I implemented the `__str__` method in `Node`? That allows you to do `print(some_node)` in a nice way rather than seeing what you see. – JonSG Dec 25 '21 at 22:20
  • Hmm, yes, I wrote it as you said. You are welcome to try it if you want to see; I wrote the exact same example that I am using :) The second one has the exact same problem. Again, much appreciated! Your help is not taken for granted – Zvia Green Dec 25 '21 at 22:25
  • Take a look at the code I posted. It has some changes to both your classes. Also, your example uses a class `Diagnoser` that I think might be something you did not post and should be `Decision` in the context of this question. With the test data you posted, my code prints the 4 data names – JonSG Dec 25 '21 at 22:34
  • Amazing. Thank you so much! It's less important, but do you know why the return value of the function is still [<__main__.Node object at 0x000001B63BF63A60>, <__main__.Node object at 0x000001B63BF63B20>, <__main__.Node object at 0x000001B63BF63AC0>, <__main__.Node object at 0x000001B63BF63940>]? – Zvia Green Dec 25 '21 at 22:55
  • It is returning Nodes. If you want something nicer to look at, you need to implement `__repr__` and `__str__` on Node :-) – JonSG Dec 25 '21 at 23:10
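To expand on the last comment: printing a list shows each element's `__repr__`, not its `__str__`, which is why the returned list still displays `<__main__.Node object at ...>` even after `__str__` is defined. A minimal sketch, assuming the Node class from the answer; the `Node('...')` format is just one possible choice, not something from the original code.

```python
class Node:
    def __init__(self, data, positive_child=None, negative_child=None):
        self.data = data
        self.positive_child = positive_child
        self.negative_child = negative_child

    def __str__(self) -> str:
        # Used by print(node) and str(node).
        return self.data

    def __repr__(self) -> str:
        # Used when the node appears inside a list, dict, or the REPL.
        return f"Node({self.data!r})"


leaves = [Node("cold"), Node("healthy")]

print(leaves)                          # [Node('cold'), Node('healthy')]
print([leaf.data for leaf in leaves])  # ['cold', 'healthy']
```

If a plain list of strings is what you want back, as in the question's expected output, mapping the returned nodes to their data (the second print) is probably the simpler route.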