Multiclass classification using Categorical cross entropy - ValueError: Shapes (3, 1) and (1, 3) are incompatible

Question

I am building a multi-class classifier on 22500 images.

The label has 3 categories: 0, 1 and 2.

I've one-hot encoded the y label as follows:

y_train = tf.one_hot(y_train,3)
y_test = tf.one_hot(y_test,3)
y_val = tf.one_hot(y_val,3)

Since the dataset is large, I am using a tf.data.Dataset object to preprocess the data. I have zipped the data and labels using tf.data.Dataset.zip:

# creating zipped tuples of data and label
data_set_train = tf.data.Dataset.zip((X_train , y_train))
data_set_test = tf.data.Dataset.zip((X_test , y_test))
data_set_val = tf.data.Dataset.zip((X_val , y_val))

and applied a preprocessing function:


def pre_process(x,y):
     x_norm = (x - mean_Rot_MIP) / Var_Rot_MIP
     # Stacking along the last dimension to avoid having to move channel axis
     x_norm_3ch = tf.stack((x_norm, x_norm, x_norm), axis=-1)
     x_norm_3ch = tf.reshape(x_norm_3ch, [1,224,224,3])
     return x_norm_3ch , y


# creating dataset iterable with all transformations
X_train1 = data_set_train.map(pre_process)
X_test1 = data_set_test.map(pre_process)
X_val1 = data_set_val.map(pre_process)

The dataset object contains a tuple of data tensor and y label tensor like:

(<tf.Tensor: shape=(1, 224, 224, 3), dtype=float64, numpy=
array([[[[-1.02143877, -1.02143877, -1.02143877],
         [-1.02143877, -1.02143877, -1.02143877],
         [-1.02143877, -1.02143877, -1.02143877],
...,
         [-1.02143877, -1.02143877, -1.02143877],
         [-1.02143877, -1.02143877, -1.02143877],
         [-1.02143877, -1.02143877, -1.02143877]]]])>,
 <tf.Tensor: shape=(3,), dtype=float32, numpy=array([1., 0., 0.], dtype=float32)>)

The shape of each input is (1, 224, 224, 3) and the shape of each y label is (3,).

I am using ResNet50 with a few additional head layers to perform the classification:

baseModel = ResNet50(weights=None, include_top=False, input_tensor=Input(shape=(224, 224, 3)))
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(256, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(3, activation="softmax")(headModel)
model = Model(inputs=baseModel.input, outputs=headModel)

and using categorical_crossentropy as the loss function.

# compile the model
INIT_LR = 1e-4
BS = 16
NUM_EPOCHS = 20

opt = Adam(lr=INIT_LR, decay=INIT_LR / NUM_EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt,metrics=["accuracy"])

# train the model
H = model.fit(X_train1, batch_size = BS, validation_data=(X_val1), epochs = NUM_EPOCHS, shuffle =False)

When I fit the model, I get the following error:

Traceback (most recent call last):

  File "<ipython-input-81-eda0da51ce9e>", line 1, in <module>
    H = model.fit(X_train1, batch_size = BS, validation_data=(X_val1), epochs = NUM_EPOCHS, shuffle =False)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit
    tmp_logs = train_function(iterator)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\def_function.py", line 814, in _call
    results = self._stateful_fn(*args, **kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 2828, in __call__
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 3210, in _maybe_define_function
    return self._define_function_with_shape_relaxation(args, kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 3142, in _define_function_with_shape_relaxation
    args, kwargs, override_flat_arg_shapes=relaxed_arg_shapes)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 3075, in _create_graph_function
    capture_by_value=self._capture_by_value),

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\def_function.py", line 600, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)

ValueError: in user code:

    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py:806 train_function  *
        return step_function(self, iterator)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py:796 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py:789 run_step  **
        outputs = model.train_step(data)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py:749 train_step
        y, y_pred, sample_weight, regularization_losses=self.losses)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\compile_utils.py:204 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\losses.py:149 __call__
        losses = ag_call(y_true, y_pred)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\losses.py:253 call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
        return target(*args, **kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\losses.py:1535 categorical_crossentropy
        return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
        return target(*args, **kwargs)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\keras\backend.py:4687 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py:1134 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (3, 1) and (1, 3) are incompatible

What am I doing wrong, and how can I fix it?

score 1 · Answer 1 · answered Aug 21 '20 at 07:05

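The issue was the shape of the y label: each label had shape (3,), while the model's output for a single (1, 224, 224, 3) input has shape (1, 3). I fixed it by reshaping the label with tf.reshape(y, [1, 3]). The only change I made was in the pre_process function: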

def pre_process(x,y):
     x_norm = (x - mean_Rot_MIP) / Var_Rot_MIP
     # Stacking along the last dimension to avoid having to move channel axis
     x_norm_3ch = tf.stack((x_norm, x_norm, x_norm), axis=-1)
     x_norm_3ch = tf.reshape(x_norm_3ch, [1,224,224,3])
     y_s = tf.reshape(y,[1,3])
     return x_norm_3ch , y_s

I'm sure there are other ways to accomplish this, but this one worked with minimal changes.
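
A likely explanation for the (3, 1) versus (1, 3) mismatch: the dataset is never batched, so Keras treats each element as one batch. The (1, 224, 224, 3) image is read as a batch of one and yields a (1, 3) prediction, while the (3,) label is read as three scalar targets and expanded to (3, 1). The sketch below uses small placeholder tensors (the names x, y, ds and fixed are illustrative, not from the original code) to show the per-element label shape before and after the reshape.

```python
import tensorflow as tf

# Placeholder data: 5 elements, each already carrying a leading axis of size 1,
# mimicking the (1, 224, 224, 3) inputs with a smaller 8x8 image.
x = tf.zeros((5, 1, 8, 8, 3))
y = tf.one_hot([0, 1, 2, 0, 1], 3)  # each label has shape (3,)

ds = tf.data.Dataset.from_tensor_slices((x, y))
print(ds.element_spec[1].shape)     # (3,)

# Give the label the same leading axis of size 1 as the image.
fixed = ds.map(lambda image, label: (image, tf.reshape(label, [1, 3])))
print(fixed.element_spec[1].shape)  # (1, 3)
```

After the reshape, the label and the prediction for each element agree on shape (1, 3), which is what the shape check in categorical_crossentropy requires.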

Nicolas Gervais · Answer 2 · 2020-08-19T14:41:14.023

I think using zip() is a weird way to go about this. Why don't you use from_tensor_slices and batch:

data_set_train = tf.data.Dataset.from_tensor_slices((X_train , y_train)).batch(4)

This should work. With a batch size of 4, each label batch should have shape (4, 3).

Corrected example:

import tensorflow as tf

x = tf.random.uniform(minval=0, maxval=1, shape=(100, 224, 224, 3), dtype=tf.float32)
y = tf.random.uniform(minval=0, maxval=3, shape=(100,), dtype=tf.int32)

y = tf.keras.utils.to_categorical(y, num_classes=3)

BS = 16
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(BS)

baseModel = tf.keras.applications.ResNet50(weights=None, include_top=False,
                                  input_tensor=tf.keras.Input(shape=(224, 224, 3)))
headModel = baseModel.output
headModel = tf.keras.layers.AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = tf.keras.layers.Flatten(name="flatten")(headModel)
headModel = tf.keras.layers.Dense(256, activation="relu")(headModel)
headModel = tf.keras.layers.Dropout(0.5)(headModel)
headModel = tf.keras.layers.Dense(3, activation="softmax")(headModel)
model = tf.keras.Model(inputs=baseModel.input, outputs=headModel)

INIT_LR = 1e-4
NUM_EPOCHS = 1

opt = tf.keras.optimizers.Adam(lr=INIT_LR, decay=INIT_LR / NUM_EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt,metrics=["accuracy"])

H = model.fit(ds, epochs = NUM_EPOCHS, shuffle=False)
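
The preprocessing from the question is not part of the example above. Here is a minimal sketch of one way to add it, assuming the raw images are single-channel (224, 224) arrays: apply map per sample without adding a batch axis, then let batch add that axis to images and labels together. MEAN and STD are hypothetical placeholders for the dataset statistics, and the random tensors stand in for real data.

```python
import tensorflow as tf

MEAN, STD = 0.5, 0.25  # hypothetical normalization statistics
BS = 4

# Placeholder data: 8 single-channel images and one-hot labels for 3 classes.
images = tf.random.uniform((8, 224, 224), dtype=tf.float32)
labels = tf.one_hot([0, 1, 2, 0, 1, 2, 0, 1], 3)

def pre_process(x, y):
    x_norm = (x - MEAN) / STD
    # Repeat the single channel three times: (224, 224) -> (224, 224, 3).
    x_3ch = tf.stack((x_norm, x_norm, x_norm), axis=-1)
    return x_3ch, y  # no batch axis is added here

ds = (tf.data.Dataset.from_tensor_slices((images, labels))
      .map(pre_process)
      .batch(BS))  # adds the batch axis to images and labels together

x_batch, y_batch = next(iter(ds))
print(x_batch.shape, y_batch.shape)  # (4, 224, 224, 3) (4, 3)
```

A dataset built this way can be passed to model.fit(ds, epochs=...) without a batch_size argument, since the dataset is already batched.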

Why are we adding `batch`? I ended up using zip because I have to pass both x and y as a tuple. In fact, one of my previous questions led me to use zip: https://stackoverflow.com/questions/63456492/data-api-valueerror-y-argument-is-not-supported-when-using-dataset-as-input — Divya Choudhary, Aug 19 '20 at 14:20
Also, I tried the solution you mentioned, but now I get the following error: ```ValueError: Unbatching a dataset is only supported for rank >= 1``` — Divya Choudhary, Aug 19 '20 at 14:23
There is really no reason to use `zip` here. The answer you're referring to isn't necessarily right. You also added a batch dimension, and it's needed. So in my answer I made a corrected example you can work from. Only thing I didn't implement is the preprocessing which I don't understand. — Nicolas Gervais, Aug 19 '20 at 14:43
