I have a basic question about calculating the entropy of a split.
Assume I have a set with 2 classes, Yes and No. The set contains 3 Yes samples and 2 No samples.
If I calculate the entropy of this set I obtain:
-(2/5)*(log(2/5)/log(2)) - (3/5)*(log(3/5)/log(2)) = 0.9710
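
For reference, here is the same calculation as a minimal Python sketch. The function name `entropy` and the choice to pass the class counts as a list are only illustrative assumptions; the formula is the standard base-2 Shannon entropy used above.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    # Classes with zero samples are skipped: p*log2(p) is taken as 0 when p == 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 3 Yes samples and 2 No samples, as in the set described above.
print(round(entropy([3, 2]), 4))  # 0.971
```

The function can be called with any other list of counts to see how the value changes with the class balance.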
Now this confuses me. If the entropy were zero, I would have only samples of one class. My understanding is that an entropy of 0.5 (for 2 classes) means I have 50% Yes and 50% No samples. So what exactly does a value close to 1 tell me?
A pointer would be appreciated. I feel like I am missing the obvious here, but I don't understand when the entropy can reach 1.