
My data is pretty simple: the inputs are arrays of m=10 real numbers and the outputs are arrays of m=10 integers. I'm trying to sort arrays with a neural network: I feed in random numbers and I expect the network to output argsort(x), i.e. the array of integer indices that would sort the input (a data-generation sketch follows the example below).

(N,10) input

array([0.2843506 , 0.89343795, 0.37331381, 0.67697506, 0.85043472,
       0.77386477, 0.08902575, 0.94891316, 0.83865261, 0.93507237])

--->(N,10) output

array([6, 0, 2, 3, 5, 8, 4, 1, 9, 7])
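
For concreteness, here is one way such (input, target) pairs can be generated, as a minimal NumPy sketch; the sample count N and the uniform distribution are placeholders, not necessarily my exact setup:

import numpy as np

m = 10       # length of each array
N = 10000    # number of samples (placeholder)

X = np.random.rand(N, m)     # (N, 10) random inputs
y = np.argsort(X, axis=1)    # (N, 10) integer targets: indices that would sort each row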

I know standard practice would be to one-hot encode the targets, but I understand you can skip that by using the sparse_categorical_crossentropy loss.
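
To illustrate the two label formats (a toy snippet, not from my code): categorical_crossentropy expects one-hot targets, while sparse_categorical_crossentropy takes plain integer class indices.

import numpy as np
from keras.utils import to_categorical

labels = np.array([6, 0, 2, 3, 5, 8, 4, 1, 9, 7])    # integer indices, for sparse_categorical_crossentropy
one_hot = to_categorical(labels, num_classes=10)      # (10, 10) one-hot matrix, for categorical_crossentropy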

Here's my model:

from keras.layers import Input, Dense
from keras.models import Model

def model_build():
  in_x = Input(shape=(m,))
  x = Dense(m, activation='relu')(in_x)
  x = Dense(m, activation='relu')(x)
  x = Dense(m, activation='relu')(x)
  x = Dense(m, activation='softmax')(x)
  model = Model(inputs=in_x, output=x)
  return model

Builds just fine:

Model: "model_110"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_111 (InputLayer)       (None, 10)                0         
_________________________________________________________________
dense_424 (Dense)            (None, 10)                110       
_________________________________________________________________
dense_425 (Dense)            (None, 10)                110       
_________________________________________________________________
dense_426 (Dense)            (None, 10)                110       
_________________________________________________________________
dense_427 (Dense)            (None, 10)                110       
=================================================================
Total params: 440
Trainable params: 440
Non-trainable params: 0
_________________________________________________________________
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:9: UserWarning: Update your `Model` call to the Keras 2 API: `Model(inputs=Tensor("in..., outputs=Tensor("de...)`
  if __name__ == '__main__':



It also compiles without problems:

from keras.optimizers import Adam

opt = Adam()
model.compile(optimizer=opt,
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

But it doesn't fit:

H = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100)

It gives me a shape error:

ValueError: Error when checking target: expected dense_427 to have shape (1,) but got array with shape (10,)

Here's the full error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-520-8559bb070d1e> in <module>()
      1 H=model.fit(X_train,y_train, validation_data=(X_val,y_val),
      2             #callbacks=callbacks_list,
----> 3             epochs=100)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
   1152             sample_weight=sample_weight,
   1153             class_weight=class_weight,
-> 1154             batch_size=batch_size)
   1155 
   1156         # Prepare validation data.

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    619                 feed_output_shapes,
    620                 check_batch_axis=False,  # Don't enforce the batch size.
--> 621                 exception_prefix='target')
    622 
    623             # Generate sample-wise weight values given the `sample_weight` and

/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    143                             ': expected ' + names[i] + ' to have shape ' +
    144                             str(shape) + ' but got array with shape ' +
--> 145                             str(data_shape))
    146     return data
    147 

ValueError: Error when checking target: expected dense_427 to have shape (1,) but got array with shape (10,)
  • Your problem formulation is not consistent with the implementation of the model. The model has one softmax layer; therefore it would generate a **single** one-hot encoded vector of length 10. But according to your problem formulation (and the labels shape) you would need 10 one-hot encoded vectors of length 10. – today Apr 13 '20 at 18:34
  • Hi, thank you for your answer, so how am I supposed to make it work? I was following this tutorial. https://www.dlology.com/blog/how-to-use-keras-sparse_categorical_crossentropy/ – 9879ypxkj Apr 13 '20 at 18:43
  • As far as I know, this is a bit trickier than you think. Creating a neural network that learns how to sort a set of input numbers, although it seems trivial, turns out to be much more difficult. There have been various papers and models that try to tackle this problem, but none of them are perfect. For example, based on your model, a very simple approach would be to have a Dense layer with 10*10 units and no activation function, reshape it to `(10, 10)` and then apply softmax. You would end up with 10 one-hot vectors; >>> – today Apr 13 '20 at 18:53
  • >>> however, the problem is that these vectors may not necessarily be mutually exclusive, e.g. you might have two vectors pointing to the third position. Another simple idea might be to use an RNN to process the given input sequentially and generate a similar one-hot encoded ordering (but it has the same flaw). Try searching for "sorting numbers neural network" or "ranking model machine learning" to find various problem formulations, cost functions and architectures used to tackle this problem. – today Apr 13 '20 at 18:56
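
For reference, a minimal sketch of the reshape-then-softmax idea suggested in the comments above (layer sizes are arbitrary, and as noted the per-position distributions are not constrained to form a valid permutation):

from keras.layers import Input, Dense, Reshape, Activation
from keras.models import Model
from keras.optimizers import Adam

m = 10

def model_build_per_position():
  in_x = Input(shape=(m,))
  x = Dense(128, activation='relu')(in_x)
  x = Dense(128, activation='relu')(x)
  x = Dense(m * m)(x)             # one score per (output position, candidate index) pair
  x = Reshape((m, m))(x)          # (batch, m, m)
  x = Activation('softmax')(x)    # softmax over the last axis: one distribution per output position
  return Model(inputs=in_x, outputs=x)

model = model_build_per_position()
model.compile(optimizer=Adam(),
              loss='sparse_categorical_crossentropy',   # integer targets of shape (N, m)
              metrics=['sparse_categorical_accuracy'])

Depending on the Keras version, the integer targets may need a trailing axis (shape (N, m, 1), e.g. via np.expand_dims(y, -1)) to pass the target shape check.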
