[image not available]

    A=c("f","t","t","f","t","f","f","f","t","f")
    B=c("t","t","t","t","t","f","f","f","t","t")
    class=c("+","+","+","-","+","-","-","-","-","-")
    df=data.frame(A,B,class)
    df
       A B class
    1  f t     +
    2  t t     +
    3  t t     +
    4  f t     -
    5  t t     +
    6  f f     -
    7  f f     -
    8  f f     -
    9  t t     -
    10 f t     -

I split the data on attribute A and on attribute B, and counted the classes in each branch as follows:

         {A}
       [T , F]          
    /         \                  
 -------     -------
 [3+,1-]     [1+,5-]





         {B}
       [T , F]          
    /         \                  
 -------     -------
 [4+,3-]     [0+,3-]

Based on the formula above, I calculated the entropy with this R code.
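For reference, the R code in this post computes what is usually called the weighted (conditional) entropy of a split. Assuming that is the formula meant here, it can be written as follows, where the symbols $S$ (all rows), $S_v$ (rows where the attribute takes value $v$) and $p_c$ (proportion of class $c$) are my own notation:

```latex
H(S) = -\sum_{c \in \{+,-\}} p_c \log_2 p_c
\qquad
H(S \mid A) = \sum_{v \in \{t,f\}} \frac{|S_v|}{|S|}\, H(S_v)
```

For attribute A this gives $\frac{6}{10}\cdot 0.6500 + \frac{4}{10}\cdot 0.8113 \approx 0.7145$, which agrees with the value computed in R.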

1. For attribute A:

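    t=table(A,class)   # contingency table: rows = values of A, columns = class
    t
    #>    class
    #> A   - +
    #>   f 5 1
    #>   t 1 3
    prop1=t[1,]/sum(t[1,])   # class proportions in the branch A = f
    prop1
    #>         -         +
    #> 0.8333333 0.1666667
    prop2=t[2,]/sum(t[2,])   # class proportions in the branch A = t
    prop2
    #>    -    +
    #> 0.25 0.75
    H1=-(prop1[1]*log2(prop1[1]))-(prop1[2]*log2(prop1[2]))
    H1
    #> 0.6500224
    H2=-(prop2[1]*log2(prop2[1]))-(prop2[2]*log2(prop2[2]))
    H2
    #> 0.8112781
    # weight each branch entropy by the share of rows in that branch
    entropy=(table(A)[1]/length(A))*H1 +(table(A)[2]/length(A))*H2
    entropy
    #> 0.7145247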

2. For attribute B:

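    t=table(B,class)   # contingency table: rows = values of B, columns = class
    t
    #>    class
    #> B   - +
    #>   f 3 0
    #>   t 3 4
    prop1=t[1,]/sum(t[1,])   # class proportions in the branch B = f
    prop1
    #>  - +
    #>  1 0
    prop2=t[2,]/sum(t[2,])   # class proportions in the branch B = t
    prop2
    #>         -         +
    #> 0.4285714 0.5714286
    H1=-(prop1[1]*log2(prop1[1]))-(prop1[2]*log2(prop1[2]))
    H1
    #> NaN
    H2=-(prop2[1]*log2(prop2[1]))-(prop2[2]*log2(prop2[2]))
    H2
    #> 0.9852281
    entropy=(table(B)[1]/length(B))*H1 +(table(B)[2]/length(B))*H2
    entropy
    #> NaN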

When I calculate the entropy for attribute B, the result is NaN. This is caused by the zero proportion in the branch B = f: log2(0) is -Inf in R, and 0 * -Inf evaluates to NaN. How can I fix this, or how can I make H1 come out as zero instead of NaN in this situation?
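One possible way to handle this, sketched here under the usual convention that 0 * log2(0) is treated as 0 (its limit as the proportion goes to zero), is to drop zero proportions before taking the logarithm. The helper names `entropy_counts` and `split_entropy` are made up for this illustration and do not come from any package:

```r
# Data from the question
A <- c("f","t","t","f","t","f","f","f","t","f")
B <- c("t","t","t","t","t","f","f","f","t","t")
class <- c("+","+","+","-","+","-","-","-","-","-")

# Entropy of a vector of class counts.
# Assumption: a class with zero count contributes 0 (convention 0 * log2(0) = 0).
entropy_counts <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]            # drop zero proportions so log2(0) is never evaluated
  -sum(p * log2(p))
}

# Weighted entropy of `class` after splitting on the attribute `attr`
split_entropy <- function(attr, class) {
  tab <- table(attr, class)                  # rows = attribute values
  weights <- rowSums(tab) / sum(tab)         # share of rows in each branch
  sum(weights * apply(tab, 1, entropy_counts))
}

split_entropy(A, class)   # approx. 0.7145, same as the manual calculation
split_entropy(B, class)   # approx. 0.6897 instead of NaN
```

With this sketch the pure branch B = f gets entropy 0, so the result for B is just 0.7 * 0.9852281. Another option along the same lines would be to keep the original expressions and replace NaN terms with 0, for example via `ifelse(p > 0, p * log2(p), 0)`.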

zhyan