I am given a text of n characters and a binary tree generated by Huffman coding. Each leaf node has two attributes: a string (the character itself) and an integer (its frequency in the text). The path from the root to a leaf represents that character's codeword.
I would like to write a recursive function that calculates the length of the compressed text, and to find its Big-O complexity.
So, for instance, if I have the text
abaccab
and each character has the following frequency and depth in the Huffman tree:
          4
         / \
      a:3   5
           / \
        b:2   c:2
then the overall length of the compressed text is 3*1 + 2*2 + 2*2 = 11 bits (each leaf contributes its frequency times its depth).
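As a sanity check on that number, here is a minimal sketch that encodes the example text directly. It assumes left edges are labelled 0 and right edges 1 (the tree does not say which is which, and the total length is the same either way), so the `codes` table is only my illustration:

```python
# Codewords read off the example tree, assuming left = "0" and right = "1".
codes = {"a": "0", "b": "10", "c": "11"}

text = "abaccab"
encoded = "".join(codes[ch] for ch in text)

print(encoded)       # 01001111010
print(len(encoded))  # 11
```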
I came up with this, but it seems very crude:
    def get_length(node, depth):
        # Leaf node
        if node.left_child is None and node.right_child is None:
            return node.freq * depth
        # Node with only one child
        elif node.left_child is None and node.right_child is not None:
            return get_length(node.right_child, depth + 1)
        elif node.right_child is None and node.left_child is not None:
            return get_length(node.left_child, depth + 1)
        # Node with two children
        else:
            return get_length(node.left_child, depth + 1) + get_length(node.right_child, depth + 1)

    get_length(root, 0)
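To make this reproducible, here is a self-contained sketch of the same idea run on the example tree. The `Node` class is a hypothetical stand-in for my real node type (only the attribute names `freq`, `left_child` and `right_child` match the code above), and `compressed_length` is a more compact variant that treats a missing child as contributing 0, which I believe is equivalent:

```python
class Node:
    """Hypothetical stand-in for the tree node type (an assumption, not my real class)."""

    def __init__(self, char=None, freq=0, left_child=None, right_child=None):
        self.char = char
        self.freq = freq
        self.left_child = left_child
        self.right_child = right_child


def compressed_length(node, depth=0):
    """Sum freq * depth over all leaves below `node`."""
    if node is None:
        return 0  # a missing child contributes nothing
    if node.left_child is None and node.right_child is None:
        return node.freq * depth  # leaf: each occurrence costs `depth` bits
    return (compressed_length(node.left_child, depth + 1)
            + compressed_length(node.right_child, depth + 1))


# The example tree: a at depth 1, b and c at depth 2.
root = Node(
    left_child=Node("a", 3),
    right_child=Node(left_child=Node("b", 2), right_child=Node("c", 2)),
)
print(compressed_length(root))  # 11
```

Giving `depth` a default of 0 means the caller does not have to pass it for the root.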
My guess at the complexity: O(log 2n), where n is the number of characters.
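To test that guess, here is a rough sketch that counts how many nodes the traversal has to touch. The `Node` class and the `chain_tree` helper are made up for this experiment. A binary tree in which every internal node has two children has 2k - 1 nodes for k leaves, so I would expect the count to grow linearly in the number of distinct characters rather than logarithmically:

```python
class Node:
    """Hypothetical minimal node; only the child links matter here."""

    def __init__(self, left_child=None, right_child=None):
        self.left_child = left_child
        self.right_child = right_child


def count_visits(node):
    """Count the nodes in the subtree; the traversal is called once on each."""
    if node is None:
        return 0
    return 1 + count_visits(node.left_child) + count_visits(node.right_child)


def chain_tree(k):
    """Build a maximally unbalanced tree with k leaves (k >= 1)."""
    tree = Node()
    for _ in range(k - 1):
        tree = Node(Node(), tree)  # new internal node: one leaf plus the old tree
    return tree


for k in (1, 3, 10, 100):
    print(k, count_visits(chain_tree(k)))  # second number is 2*k - 1
```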
How can I improve this? What would be the complexity in this case?