
I have a basic question about calculating the entropy of a split.

Assume I have a set with 2 classes, Yes and No. In this set I have 3 samples of Yes and 2 samples of No.

If I calculate the entropy of this set I obtain:

-(2/5)*(log(2/5)/log(2))-(3/5)*(log(3/5)/log(2))=0.9710

Now, that confuses me. If the entropy were zero, I would have only samples of one class. If the entropy were 0.5 (for 2 classes), I would have 50% Yes and 50% No samples. So what exactly does a value close to 1 tell me?

A pointer, please. I feel like I am not seeing the obvious here, but I don't understand when the entropy can reach 1.

user1091534

2 Answers


In a binary example such as yours, the entropy of the system reaches 1 when the samples are evenly distributed across the possible outcomes (e.g. 10 samples: 5 yes, 5 no). The further you move from this even distribution, the closer the entropy gets to 0. You can see the binary entropy plot on Wikipedia.

More specifically, the entropy of a perfectly even distribution is log2(numClasses), so for 2 classes it is log2(2) == 1.
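To make this concrete, here is a minimal sketch in R (the entropy helper is just for illustration, not something from the question) that reproduces both your 0.971 and the maximum of 1:

entropy <- function(counts) {
  p <- counts / sum(counts)    # class proportions
  -sum(p * log2(p))            # Shannon entropy in bits
}

entropy(c(3, 2))   # your 3 Yes / 2 No split -> ~0.971
entropy(c(5, 5))   # perfectly even split    -> 1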

rapidninja

The empirical entropy goes to 1 as the number of observations drawn from a binomial distribution with a 50% probability of success approaches infinity.

For example,

c <- rbinom(100000, 1, 0.5)              # 100,000 Bernoulli(0.5) draws
freqsC <- table(c) / length(c)           # empirical class frequencies
entropyC <- -sum(freqsC * log2(freqsC))  # Shannon entropy in bits
entropyC
[1] 0.9999885

This is the entropy value with 100000 observations.

And here is the entropy value with 100000000 observations.

f <- rbinom(100000000, 1, 0.5)           # same experiment with 100,000,000 draws
freqsF <- table(f) / length(f)
entropyF <- -sum(freqsF * log2(freqsF))
entropyF
[1] 1

This is actually 0.999999969120836, but R prints it as 1 at the default display precision.
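If you want to see those trailing digits instead of the rounded 1, one option (a small sketch using base R's print) is to raise the print precision:

print(entropyF, digits = 15)   # shows the trailing digits, e.g. 0.999999969120836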

Hope it helps.

boyaronur