
These are the shapes of my features and target variables.

(1382, 1785, 2) (1382, 2)

The target has two labels per sample, and each label takes one of the same 28 classes. I have a CNN network as follows:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(100, 5, activation='relu', input_shape=(1785, 2)))
model.add(MaxPooling1D(pool_size=5))
model.add(Conv1D(64, 10, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(28, activation='softmax'))

When I use one hot encoded targets (1382,28) and categorical crossentropy loss function, the model runs fine and gives no errors.
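For reference, the one-hot form of an integer label column can be sketched like this (a minimal NumPy sketch; the label values here are made up for illustration, only the 28-class count comes from my data):

```python
import numpy as np

# Hypothetical integer labels for one label column, classes 0..27
labels = np.array([3, 0, 27, 14])
num_classes = 28

# Build the one-hot matrix: row i has a 1 in column labels[i]
one_hot = np.zeros((len(labels), num_classes), dtype=np.float32)
one_hot[np.arange(len(labels)), labels] = 1.0
print(one_hot.shape)  # (4, 28)
```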

But when I use sparse targets (1382,2) and sparse categorical crossentropy loss function, I run into the following error.

logits and labels must have the same first dimension, got logits shape [20,28] and labels shape [40]
 [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at \AppData\Local\Temp/ipykernel_9932/3729291395.py:1) ]] [Op:__inference_train_function_11741]

From what I have seen, others who have posted the same problem seem to have been using sparse categorical crossentropy with one-hot encoded target variables.

I think there may be some problem with the shapes of the batches. The logits' shape changes to [x, 28], where x is the batch size. Another possible issue is that I have two labels, but I have no leads on how to troubleshoot the problem from there.
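The numbers in the error message are consistent with the two labels being flattened. A minimal sketch of that shape arithmetic (plain Python, just reproducing the counts from the error, not actual TensorFlow code):

```python
batch_size = 20        # the x in logits shape [x, 28]
num_classes = 28
labels_per_sample = 2  # my targets have shape (batch, 2)

logits_rows = batch_size                           # logits shape [20, 28]
flattened_labels = batch_size * labels_per_sample  # labels flattened to [40]
print(logits_rows, flattened_labels)  # 20 40 -- the mismatch in the error
```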

Any help is highly appreciated.

Sharan Kumar

1 Answer


If you are using SparseCategoricalCrossentropy as your loss function, you need to make sure that each sample in your data belongs to exactly one class ranging from 0 to 27. For example:

import tensorflow as tf

samples = 25
labels = tf.random.uniform((samples,), maxval=28, dtype=tf.int32)
print(labels)
tf.Tensor(
[12  7  1 13 22 14 26 13  6  1 27  1 11 18  5 18  5  6 12 14 21 18 17 12
  5], shape=(25,), dtype=int32)

Consider the shape of labels: it is neither (25, 2) nor (25, 28), but (25,), which is what SparseCategoricalCrossentropy expects.
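To make the shape requirement concrete, here is a minimal NumPy sketch of what sparse categorical crossentropy computes (random logits for illustration; this is the standard formula, not Keras internals): with logits of shape (25, 28) and labels of shape (25,), each label simply indexes one class column per row.

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_classes = 25, 28
logits = rng.normal(size=(n, num_classes))
labels = rng.integers(0, num_classes, size=(n,))  # shape (25,), one class per sample

# Softmax over the class axis, then take -log of the true class's probability
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(n), labels]).mean()
print(labels.shape, loss > 0)  # (25,) True
```

With labels of shape (25, 2), that per-row indexing has no single class to pick, which is why the loss rejects multi-label targets.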

AloneTogether
  • So is there any way to use Sparse categorical crossentropy if my labels are of the form (x,2)? I can't convert the labels to (x, ) as I need both the labels to be predicted. – Sharan Kumar Dec 07 '21 at 16:37
  • Then SparseCategoricalCrossEntropy is not the right solution for your use case. It does not make any sense. You are dealing with a multi-label problem. Why can't you use categorical crossentropy? – AloneTogether Dec 07 '21 at 16:38
  • I have used categorical crossentropy. I was just wondering if I could use sparse categorical crossentropy for multilabel problems. I understand why it can't be used for multilabel now. Thanks – Sharan Kumar Dec 09 '21 at 05:41