
For educational purposes I've been creating a deep learning library for some time now. A few days ago I received a task for an intern position: create a model from scratch using numpy that classifies digits from a subset of the MNIST dataset into 2 classes (0 - odd, 1 - prime). Everything was going well until it came time to create a loss function. Because it is a binary classification problem, I chose binary cross-entropy. Here is the implementation:

def loss(self, target: np.ndarray, predicted: np.ndarray, epsilon=1e-7) -> np.ndarray:
    # clip predictions away from 0 and 1 so the logs below stay finite
    predicted = np.clip(predicted, epsilon, 1 - epsilon)
    # convert the clipped probabilities back to logits (as the Keras numpy backend does)
    predicted = np.log(predicted / (1 - predicted))
    # per-sample binary cross-entropy, recovering the probabilities via sigmoid
    return (target * -np.log(self.sigmoid(predicted)) +
            (1 - target) * -np.log(1 - self.sigmoid(predicted)))
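As a quick standalone sanity check (the same computation with sigmoid written out as a plain function; the input values are just made up for illustration):

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def bce(target, predicted, epsilon=1e-7):
        predicted = np.clip(predicted, epsilon, 1 - epsilon)
        logits = np.log(predicted / (1 - predicted))
        return (target * -np.log(sigmoid(logits)) +
                (1 - target) * -np.log(1 - sigmoid(logits)))

    print(bce(np.array([1.0]), np.array([0.9])))  # ~0.105, i.e. -log(0.9)
    print(bce(np.array([0.0]), np.array([1.0])))  # ~16.118, i.e. -log(1e-7) after clipping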

Basically, it is almost the same function that Keras uses in its numpy backend. The output of this loss function with batch size 16 is as follows:

 [[1.61180957e+01]
 [1.00000005e-07]
 [1.00000005e-07]
 [1.61180957e+01]
 [1.00000005e-07]
 [1.61180957e+01]
 [1.61180957e+01]
 [1.00000005e-07]
 [1.61180957e+01]
 [1.61180957e+01]
 [1.00000005e-07]
 [1.61180957e+01]
 [1.61180957e+01]
 [1.00000005e-07]
 [1.61180957e+01]
 [1.00000005e-07]]
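
For what it's worth, the two distinct values match the clipping bounds exactly, i.e. they are the losses for a fully saturated wrong and a fully saturated right prediction:

    print(-np.log(1e-7))      # ≈ 16.11809565
    print(-np.log(1 - 1e-7))  # ≈ 1.00000005e-07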

I have strong doubts that they should look like this. Maybe it is a problem with the dataset, which we had to prepare ourselves. To clarify, a typical sample is just a 28x28 matrix of pixel values and the label is a single number, 0 or 1. The next problem occurs when I try to sum up the loss for a whole epoch and save it to something like a Keras history object. Should I sum up the losses for every batch iteration and then divide by the number of samples (which sounds wrong to me), or is there a proper way to calculate the epoch loss?

Thanks in advance for the help, and stay safe and healthy!

Bearnardd

1 Answer


I believe your current output is for a mini-batch; otherwise your 'predicted' would be a single value and not an ndarray.
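To turn that per-sample output into a single number for the batch, you would typically take the mean (a sketch; `model`, `target_batch` and `predicted_batch` are placeholder names):

    import numpy as np

    per_sample = model.loss(target_batch, predicted_batch)  # shape (batch_size, 1)
    batch_loss = float(np.mean(per_sample))                 # scalar mean over the batch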

Also, what do you mean by epoch loss? You should be computing a loss for every mini-batch, which is the average loss over that batch as described.
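As a sketch of the epoch bookkeeping (assuming a plain dict standing in for the Keras history object; `num_epochs`, `batches` and `model` are placeholders), keep a running sum of per-sample losses and divide by the number of samples at the end of each epoch:

    import numpy as np

    history = {"loss": []}  # stand-in for a Keras-like History object

    for epoch in range(num_epochs):                    # num_epochs: placeholder
        total_loss = 0.0
        total_samples = 0
        for target_batch, predicted_batch in batches:  # batches: placeholder iterator
            per_sample = model.loss(target_batch, predicted_batch)
            total_loss += float(np.sum(per_sample))    # sum handles unequal batch sizes
            total_samples += per_sample.shape[0]
        history["loss"].append(total_loss / total_samples)  # mean per-sample loss

Summing per-sample losses and then dividing by the total sample count is just an average over the epoch, so what you describe is not wrong; it only differs from averaging the per-batch means when the batches have unequal sizes.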

Alberto MQ
  • My bad, I was talking about a mini-batch; I just don't think the loss I'm getting is right. My average loss is around 2, and when I do the same task with a Keras model I get around 200, so I suppose my calculations are wrong at some point. – Bearnardd Apr 10 '20 at 19:55