
I'm building a classifier for predicting the labels -1 and 1. When I encode the labels with a one-hot encoder and use categorical cross entropy, I don't have any problems with learning.

model1.add(Dense(2, activation='softmax'))
model1.compile(loss='categorical_crossentropy', optimizer=optim, metrics=['accuracy'])

When I keep the labels without encoding and try to use sparse categorical cross entropy, the model loss is NaN.

model1.add(Dense(2, activation='softmax'))
model1.compile(loss='sparse_categorical_crossentropy', optimizer=optim, metrics=['accuracy'])

Epoch 1/3000
20748/20748 [==============================] - 2s 78us/sample - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00

When I encode the labels with an ordinal encoder to be 0 and 1 instead of -1 and 1, and train the model with sparse categorical cross entropy, the model doesn't show this problem with the loss.
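A simplified sketch of such an encoding step (assuming sklearn's OrdinalEncoder and labels in a 1-D NumPy array):

import numpy as np
from sklearn.preprocessing import OrdinalEncoder

y = np.array([-1, 1, 1, -1])                      # original labels

# OrdinalEncoder sorts the categories, so -1 -> 0 and 1 -> 1;
# it returns floats, so cast to int for sparse_categorical_crossentropy
enc = OrdinalEncoder()
y01 = enc.fit_transform(y.reshape(-1, 1)).astype('int32').ravel()
print(y01)                                        # [0 1 1 0]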

What is the reason for this problem? I read that training with labels that are symmetric around 0 is a problem, but what is the explanation for that?

I know my problem is binary classification, but I wanted to test my model on binary data before using it for all labels.

a_jelly_fish

3 Answers


SparseCategoricalCrossentropy expects labels to be provided as integers, and those integer tokens are converted internally to a one-hot-encoded label starting at index 0. The class indices therefore have to be 0 to num_classes - 1, which is not what is in your data: having two classes, you need to provide the labels as 0 and 1, not -1 and 1. A label of -1 falls outside the valid index range, so the loss is undefined and comes out as NaN. Therefore it is as you write, you can either:

  • Run it with one-hot encoded labels using Categorical Crossentropy, or
  • run it with integer labels 0 and 1 using Sparse Categorical Crossentropy (both sketched below),

but

  • running it with labels -1 and 1 using Sparse Categorical Crossentropy will fail.
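A minimal sketch of the two working options (assuming the labels are in a NumPy array y and TensorFlow 2.x Keras):

import numpy as np
from tensorflow.keras.utils import to_categorical

y = np.array([-1, 1, 1, -1])             # original labels
y_int = ((y + 1) // 2).astype('int32')   # remap: -1 -> 0, 1 -> 1

# option 1: one-hot labels + categorical_crossentropy
y_onehot = to_categorical(y_int, num_classes=2)
# model1.compile(loss='categorical_crossentropy', ...)

# option 2: integer labels 0/1 + sparse_categorical_crossentropy
# model1.compile(loss='sparse_categorical_crossentropy', ...)
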
Stat Tistician

Another case I observed was that the labels I passed were in float format. As the documentation reads, you need to make sure the labels are integers and not floats. This fixed my issue.
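A minimal sketch of the cast, assuming the labels sit in a NumPy array:

import numpy as np

y = np.array([0.0, 1.0, 1.0, 0.0])   # float labels trip up the loss
y = y.astype('int32')                # integer class indices: [0 1 1 0]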

SecretAgent

Missing labels in the range 0 to num_classes - 1 cause NaN for sparse_categorical_crossentropy.

A quick hack, if you would like to use sparse categorical cross entropy in these situations: add just one sample for each missing label to both the training and testing datasets.

For images, you can change the label of an existing training/testing sample to the missing label and run the fit function. You will then start seeing a loss value instead of NaN.

Accordingly, the number of neurons in the last/top layer also needs to be changed. A sketch of this hack follows.
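A minimal sketch of this hack (the helper name is my own; it assumes 1-D integer label arrays):

import numpy as np

def patch_missing_labels(y, num_classes):
    # relabel one existing sample per missing class so that every
    # index in 0..num_classes-1 occurs at least once
    present = set(np.unique(y))
    missing = [c for c in range(num_classes) if c not in present]
    for i, c in enumerate(missing):
        y[i] = c  # sacrifice sample i: overwrite its label
    return y

y_train = np.array([0, 0, 2, 2, 3])   # class 1 is missing
y_train = patch_missing_labels(y_train, num_classes=4)
print(np.unique(y_train))             # [0 1 2 3]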

jkr