I understand how to create Huffman trees when all the frequencies are different, but how would I draw the Huffman tree if some of the frequencies are the same?

A simple explanation of Huffman trees is found here

The data for the Huffman tree I am trying to create:

Letter Frequency
A       15%
B       15%
C       10%
D       10%
E       30%
F       20%

I start with the two lowest frequencies, which belong to the letters C and D:

   .
  / \
 C   D

But what would be the next step? A and B have the same frequency, so which one do we choose? And if we choose one of them, how will the structure look once the second one is added?

If I choose B then it will look like this (unless I am wrong)

     .
    / \
   B   .
      / \
     C   D

What happens after this step?

These can be coded in Java and C as well, and I am trying to figure out how they work before implementing them.

EDIT

My tree looks like this:

         ___________|_________________
        /\                            |
       /  \                           |
      F    E                          |
     / \                              |
    /   \                             |
   B     A                           /\
                                    /  \
                                   C    D

I also got an example from online (image).

Vadim Kotov
Bic B
  • You always pick the two lowest frequencies, so your second step is wrong. You don't pick CD and B (20% and 15% respectively) -- you pick A and B (15% and 15%). For this particular set of frequencies, there is never ambiguity in picking the lowest two. However that can happen. You can have sets of frequencies with several different trees with different topologies. However all of them have exactly the same average number of bits with the frequencies applied and all are optimal. – Mark Adler Aug 14 '12 at 15:23

3 Answers

Step-by-step answer to your problem.

So you start with

A = 15%  
B = 15% 
C = 10% * 
D = 10% *
E = 30%
F = 20%

You pick the two lowest (C and D) and join them (their sum is 20).

  20
 / \
C   D

You now have

A = 15%  *
B = 15%  *
C+D = 20% 
E = 30%
F = 20%

Now you join the next two lowest (A and B), which sum to 30.

      20      30
     / \     / \
    C   D    A  B

You now have

A+B = 30%  
C+D = 20% *
E = 30%
F = 20%   *

Lowest are (C+D, F), so you join them

    40
   /  \      
  F   20      30
     / \     / \
    C   D    A  B


A+B = 30% *
C+D+F = 40% 
E = 30% *

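Next step, same as before: join the two lowest (A+B and E), which sum to 60. Only two subtrees are left, and joining them gives the root.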

A+B+E = 60% *
C+D+F = 40% *


        100
       /   \
    40        60
   /  \      /  \
  F   20    E    30
     / \        / \
    C   D       A  B
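
The steps above can be sketched in Python with a min-heap. This is only an illustration and not part of the original answer: `huffman_codes` and the tie counter are names chosen here, ties are broken by insertion order (one arbitrary but valid choice), and the left branch is labelled 0 by convention.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Return {symbol: bitstring} for a dict of symbol -> frequency.

    Assumes symbols are not tuples, because tuples mark internal nodes.
    """
    tie = count()  # unique counter so the heap never has to compare nodes
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Always take the two subtrees with the lowest total frequency.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: an original letter
            codes[node] = prefix or "0"  # a lone symbol still needs one bit

    walk(heap[0][2], "")
    return codes

freqs = {"A": 15, "B": 15, "C": 10, "D": 10, "E": 30, "F": 20}
codes = huffman_codes(freqs)
lengths = {sym: len(code) for sym, code in codes.items()}
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / sum(freqs.values())
print(codes)
print(avg_bits)
```

With the frequencies from the question, this tie-breaking reproduces the shape of the tree drawn above: E and F end up two levels below the root, A, B, C and D three levels below, and every letter is a leaf.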
KadekM
You proceed in the same way when some frequencies are equal. Here is an example (with a different set of frequencies from yours):

|     letter      |  A  |  B  |  C  |  D  |  E  |  F  |
|-----------------|-----|-----|-----|-----|-----|-----|
|      freq       |  10 |  20 |  30 |  5  |  25 |  10 |
|-----------------|-----|-----|-----|-----|-----|-----|

Sort by frequency, highest first:

|-----------------|-----|-----|-----|-----|-----|-----|
|     letter      |  C  |  E  |  B  |  F  |  A  |  D  |
|-----------------|-----|-----|-----|-----|-----|-----|
|      freq       |  30 |  25 |  20 |  10 |  10 |  5  |
|-----------------|-----|-----|-----|-----|-----|-----|

Building the tree, step 1:

freq           30    10     5     10     20     25
symbol          C     A     D      F      B      E
                      |     |
                      |--|--|
                        ||-|
                        |15|  = 5 + 10

Step 2:

freq          30    10     5     10     20     25
symbol         C     A     D      F      B      E
                     |     |      |
                     |     |      |
                     | |--||      |
                     |-|15||      |
                       ||-|       |
                        |         |
                        |    |--| |
                        |----|25|-| = 10 + 15
                             |--|

Step 3 (the remaining merges, drawn together):

freq         30    10     5     10     20     25
sym          C     A     D      F      B      E
             |     |     |      |      |      |
             |     |     |      |      |      |
             |     | |--||      |      |      |
             |     |-|15||      |      |      |
             |       ||-|       |      |      |
             |        |         |      |      |
             |        |    |--| |      | |--| |
             |        |----|25|-|      |-|45|-|
             |             ||-|          ||-|
             |    |--|      |             |
             |----|55|------|             |
                  |-||                    |
                    |   |------------|    |
                    |---| Root (100) |----|
                        |------------|

encoding:

   C = 00   
   A = 0100 
   D = 0101 
   F = 011  
   B = 10   
   E = 11   
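
As a quick sanity check on the table above, here is a small Python sketch (the helper names `is_prefix_free` and `decode` are made up for this illustration). It confirms that no code is a prefix of another, computes the average number of bits per symbol, and round-trips a sample word.

```python
freqs = {"A": 10, "B": 20, "C": 30, "D": 5, "E": 25, "F": 10}
codes = {"C": "00", "A": "0100", "D": "0101", "F": "011", "B": "10", "E": "11"}

def is_prefix_free(codes):
    """True if no code word is a prefix of (or equal to) another one."""
    words = sorted(codes.values())
    # After sorting, a word that is a prefix of another is always
    # immediately followed by a word that extends it.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def decode(bits, codes):
    """Decode a bit string by reading until the bits match a code word."""
    inverse = {code: sym for sym, code in codes.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return "".join(out)

avg_bits = sum(freqs[s] * len(codes[s]) for s in freqs) / sum(freqs.values())
encoded = "".join(codes[s] for s in "CAFE")

print(is_prefix_free(codes))   # True
print(avg_bits)                # 2.4
print(encoded)                 # 00010001111
print(decode(encoded, codes))  # CAFE
```

Because the code is prefix-free, the decoder never needs separators between letters: it emits a symbol as soon as the bits read so far match a code word.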
Dmitry Zagorulkin
  • I understand steps 1 and 2 but don't understand what you did for step 3. Why did you include B and E as separate? – Bic B Aug 14 '12 at 11:07
  • @AmberArroway you should sum close freqs, hence: `1 (10 and 5 = 15)`, `2 (15 and 10 = 25)`, `3 (25 and 30)`, `4 (20 and 25)` and `5 (55 and 45)`. I didn't separate steps 3 and 4. – Dmitry Zagorulkin Aug 14 '12 at 11:11
  • Let me post the tree I have drawn, can you tell me where I have gone wrong please? – Bic B Aug 14 '12 at 11:13
  • @AmberArroway could you say what you didn't understand? I will try to answer your question. – Dmitry Zagorulkin Aug 14 '12 at 11:15
  • So we choose the two lightest-weight trees (any of them if more than two are tied), then merge the two chosen trees into a single tree with a new root node whose left and right sub-trees are the two we chose. The weight of the new tree is the sum of the weights of the merged trees? – Bic B Aug 14 '12 at 11:17
  • It may not be two. The main idea of the algorithm is that you should sum the closest pairs of frequencies. >>The weight of the new tree is the sum of the weights of the merged trees>> Yes. – Dmitry Zagorulkin Aug 14 '12 at 11:24
  • Can you draw a picture of how my tree would look? I kind of understand but the examples are different =( – Bic B Aug 14 '12 at 11:27
It doesn't really matter which one you choose. You may get a slightly different encoding, but it is equally optimal: the average code length is the same. In some cases there are several possible ways to build the tree, and any of them is fine.
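
To see this concretely, here is a Python sketch that builds the tree twice with opposite tie-breaking rules and compares the results. The frequencies are made up for the illustration (they are not the ones from the question), and `code_lengths` and `newest_first` are hypothetical names.

```python
import heapq
from itertools import count

def code_lengths(freqs, newest_first=False):
    """Return {symbol: code length} for a Huffman tree over freqs.

    newest_first flips which node wins when two frequencies are equal.
    Assumes symbols are not tuples, because tuples mark internal nodes.
    """
    sign = -1 if newest_first else 1
    order = count()
    heap = [(f, sign * next(order), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, sign * next(order), (a, b)))

    lengths = {}
    stack = [(heap[0][2], 0)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, tuple):   # internal node: descend both ways
            stack.append((node[0], depth + 1))
            stack.append((node[1], depth + 1))
        else:                         # leaf: depth is the code length
            lengths[node] = depth
    return lengths

def total_bits(freqs, lengths):
    return sum(freqs[s] * lengths[s] for s in freqs)

freqs = {"A": 1, "B": 1, "C": 2, "D": 2}  # illustrative values with ties
oldest = code_lengths(freqs)
newest = code_lengths(freqs, newest_first=True)
print(oldest, total_bits(freqs, oldest))
print(newest, total_bits(freqs, newest))
```

One rule gives every letter a 2-bit code; the other gives lengths 1, 2, 3 and 3. The trees have different shapes, yet both need 12 bits in total for these frequencies.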

I've edited the image because I made a mistake; check out my second answer for the correct one.

KadekM
  • Another answer is different to yours...which one is right haha – Bic B Aug 14 '12 at 11:00
  • I am pretty sure this one is correct. I've had it at uni+exam, unless I made some stupid mistake (but I checked it), but the way I created it is correct. If you were to follow Lajos Arpad's way you would end up always with such deep tree, which I believe is not correct. – KadekM Aug 14 '12 at 11:02
  • His one looks different even to mine. I'll include the tree I have drawn. – Bic B Aug 14 '12 at 11:10
  • The way to create the tree is very simple: you always pick the two subtrees with the lowest probability and join them. At the start you have 6 subtrees (one per letter). Once you start connecting them into a bigger tree, their probabilities sum at their parent. – KadekM Aug 14 '12 at 11:13
  • Have a look at the image I uploaded. – Bic B Aug 14 '12 at 11:19
  • Mine was wrong; check out my other answer, where it's step by step. Your problem is that A+B can't have F as a parent. The rule of thumb is that all the nodes you posted (A...F) have to be leaves in the Huffman tree. – KadekM Aug 14 '12 at 11:42