
I am using Cross Entropy with Softmax as the loss function for my neural network. The cross entropy function I have written is as follows:

import math

def CrossEntropy(calculated, desired):
    # Average (binary) cross entropy over all output units
    total = 0
    n = len(calculated)
    for i in range(n):
        total += (desired[i] * math.log(calculated[i])) + ((1 - desired[i]) * math.log(1 - calculated[i]))

    crossentropy = -total / n
    return crossentropy

Now let us suppose the desired output is [1,0,0,0] and we are testing it against two calculated outputs, a=[0.1,0.9,0.1,0.1] and b=[0.1,0.1,0.1,0.9]. The problem is that the function returns exactly the same cross entropy value for both of these calculated outputs. So how does the neural network learn which output is the correct one?
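For concreteness, calling the function above on these two vectors prints the same number twice (a quick sanity check; the names a and b here are just for illustration):

desired = [1, 0, 0, 0]
a = [0.1, 0.9, 0.1, 0.1]
b = [0.1, 0.1, 0.1, 0.9]

print(CrossEntropy(a, desired))   # ~1.20397
print(CrossEntropy(b, desired))   # ~1.20397, identical to the value for a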


1 Answer


That is expected, because there is a data symmetry between your two calculated cases.

In your example, the desired output is [1, 0, 0, 0], so the true class is the first class. However, in both a and b your prediction for the first class is the same (0.1). For the other classes (the true negatives: the 2nd, 3rd and 4th classes), you have this data symmetry as well (class 2 and class 4 contribute equally to the loss calculation).

 a -> 0.9, 0.1, 0.1
       |
       +---------+   (the 0.9 simply moves from class 2 to class 4)
                 |
                 v
 b -> 0.1, 0.1, 0.9

Thus you get the same loss, which is expected.
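To make the symmetry concrete, here is a small sketch (the helper name per_term_contributions is just illustrative) that prints each output unit's contribution to the sum before averaging and negation; a and b produce the same four values, only in a different order, so their means are identical:

import math

def per_term_contributions(calculated, desired):
    # Per-output-unit terms of the sum inside CrossEntropy (before averaging and negation)
    return [d * math.log(c) + (1 - d) * math.log(1 - c)
            for c, d in zip(calculated, desired)]

desired = [1, 0, 0, 0]
print(per_term_contributions([0.1, 0.9, 0.1, 0.1], desired))
# approx [-2.303, -2.303, -0.105, -0.105]
print(per_term_contributions([0.1, 0.1, 0.1, 0.9], desired))
# approx [-2.303, -0.105, -0.105, -2.303]  -- same values, reordered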

If you remove this symmetry, you get a different cross entropy loss. See the examples below:


# The first two are from your examples.
print(CrossEntropy(calculated=[0.1, 0.9, 0.1, 0.1], desired=[1, 0, 0, 0]))
print(CrossEntropy(calculated=[0.1, 0.1, 0.1, 0.9], desired=[1, 0, 0, 0]))

# Below, the prediction for the last class is 0.75, which breaks the data symmetry.
print(CrossEntropy(calculated=[0.1, 0.1, 0.1, 0.75], desired=[1, 0, 0, 0]))

# Below, the prediction for the true class is 0.45.
print(CrossEntropy(calculated=[0.45, 0.1, 0.1, 0.9], desired=[1, 0, 0, 0]))


result:
1.20397280433
1.20397280433
0.974900121357
0.827953455132