4

Say we started with a text file like:

a 00 
b 01
c 10
d 11

00000001011011

The algorithm would be the typical one: use the prefix codes to build a Huffman tree, read the encoded bits while traversing the tree until you reach a leaf, then return the character at that leaf.
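For concreteness, the procedure described above could be sketched as follows. This is only a minimal illustration, assuming the code table is prefix-free and every code is non-empty; the names `build_tree` and `decode` and the nested-dict tree representation are choices made for this sketch, not part of the question.

```python
# Sketch: decode a bit string by walking a prefix-code tree.
# The code table and bit string are the ones from the question;
# build_tree and decode are illustrative names.

def build_tree(codes):
    """Build a trie: internal nodes are dicts keyed by '0'/'1', leaves are symbols."""
    root = {}
    for symbol, code in codes.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = symbol  # the last bit of the code leads to the leaf
    return root


def decode(bits, root):
    """Follow one edge per input bit; emit a symbol whenever a leaf is reached."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root            # restart at the root for the next symbol
    if node is not root:
        raise ValueError("input ends in the middle of a code word")
    return "".join(out)


codes = {"a": "00", "b": "01", "c": "10", "d": "11"}
tree = build_tree(codes)
decoded = decode("00000001011011", tree)
print(decoded)  # prints aaabbcd
```

Each input bit causes exactly one step down the tree, so the loop in `decode` does work proportional to the number of encoded bits.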

Could someone explain how I would determine the running time and space complexity?

atkayla

2 Answers

9

Basically, a Huffman tree involves three operations: construction, encoding, and decoding. Their time complexities differ.

First, note the following (see Wikipedia [link]):

In many cases, time complexity is not very important in the choice of algorithm here, since n here is the number of symbols in the alphabet, which is typically a very small number (compared to the length of the message to be encoded); whereas complexity analysis concerns the behavior when n grows to be very large.

  1. Construction is linear (O(n)) if the input probabilities are already sorted; see this paper. In most cases, we use a greedy O(n*log(n)) construction method: http://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
  2. If you build a bidirectional hash table over all symbols, encoding and decoding each take constant time (O(1)) per table lookup.
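The two points above could be sketched roughly like this. It is an illustration under assumptions, not the method from the linked tutorial: the greedy step uses a binary heap (Python's `heapq`), symbols are single characters, and the function names `build_codes`, `encode`, and `decode` are made up for this example.

```python
import heapq
from collections import Counter
from itertools import count


def build_codes(text):
    """Greedy construction: repeatedly merge the two lightest subtrees."""
    freq = Counter(text)
    tiebreak = count()  # unique counter so tuples never compare the node itself
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a one-symbol alphabet still needs a 1-bit code
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    # Walk the finished tree once; internal nodes are (left, right) tuples.
    codes = {}
    stack = [(heap[0][2], "")] if heap else []
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


def encode(text, codes):
    """One hash lookup per input symbol."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Reverse table: grow the current bit group until it matches a code."""
    symbols = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in symbols:
            out.append(symbols[current])
            current = ""
    return "".join(out)


text = "abracadabra"
codes = build_codes(text)
bits = encode(text, codes)
assert decode(bits, codes) == text
```

Note that in this sketch decoding probes the reverse table once per bit, so the number of lookups per symbol equals that symbol's code length; each individual lookup is what takes constant time on average.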
Tao HE
4

Assume an encoded text that decodes to n symbols and an alphabet of k symbols.

For every encoded symbol you have to traverse the tree in order to decode that symbol. The tree has k leaves (2k - 1 nodes in total, which is still O(k)) and, when it is reasonably balanced, it takes O(log k) node visits on average to decode a symbol. So the time complexity would be O(n log k).

Space complexity is O(k) for the tree and O(n) for the decoded text.
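The counts behind these bounds can be written out as a short derivation. It assumes that n counts decoded symbols, that the tree is full (every internal node has two children, which holds for a Huffman tree with k ≥ 2 symbols), and it introduces d(s) for the depth of the leaf holding symbol s, a notation that is not in the answer itself.

```latex
\begin{align*}
\text{nodes in the tree} &= k \text{ leaves} + (k-1) \text{ internal} = 2k - 1 = O(k),\\
\text{node visits to decode } s_1,\dots,s_n &= \sum_{i=1}^{n} d(s_i) = \text{number of encoded bits},\\
d(s) \le \lceil \log_2 k \rceil \text{ (balanced tree)} &\implies \sum_{i=1}^{n} d(s_i) = O(n \log k),\\
d(s) \le k - 1 \text{ (fully skewed tree)} &\implies \sum_{i=1}^{n} d(s_i) = O(nk).
\end{align*}
```

So the O(n log k) figure relies on the tree being roughly balanced, while the O(k) space bound for the tree holds for any shape.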

Jim Mischel
  • Why is the time complexity is O(k)? In the worst case the output tree could be a full binary tree having k+k/2+k/4+...+1=O(k^2) nodes. – so.very.tired Nov 19 '15 at 20:52
  • @so.very.tired: Re-read my answer. I said that the *space* complexity is O(k). With k characters there are k nodes in the tree. – Jim Mischel Nov 20 '15 at 12:37
  • Sorry I've been meaning to ask: "Why is the space complexity is... etc...", I wrote "time" instead of "space" by mistake. In Huffman tree, letters are represented by *terminal* nodes (not just any node), so with k characters, there are k terminal nodes, which means the tree can, like I said, end up with O(k^2) nodes (terminal as well as internal). – so.very.tired Nov 21 '15 at 08:37