I am trying to train a model for autonomous driving that converts input from the front camera to a bird's-eye-view image.

Both the input and the output are segmentation masks with shape (96, 144), where each pixel holds an integer from 0 to 12 (each number represents a different class).

My question is: how should I preprocess my data, and which loss function should I use for the model? (I am trying to use a fully convolutional network.)

I tried converting the inputs and outputs to shape (96, 144, 13) using Keras' to_categorical utility, so that each channel contains 0s and 1s representing the mask of one category. With this format I used binary_crossentropy and a sigmoid activation for the last layer; the model seemed to learn and the loss started decreasing.
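To make the data format concrete, here is a minimal NumPy-only sketch of the one-hot conversion described above. It is meant to produce the same kind of array as to_categorical for integer masks; the helper name `one_hot_mask` and the random example mask are assumptions made for illustration, not part of my actual pipeline.

```python
import numpy as np

NUM_CLASSES = 13  # class ids 0..12, as stated in the question


def one_hot_mask(mask, num_classes=NUM_CLASSES):
    """Turn an integer mask of shape (H, W) into a one-hot array (H, W, num_classes).

    Hypothetical helper for illustration only.
    """
    mask = np.asarray(mask, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i;
    # indexing it with the mask looks up one vector per pixel.
    return np.eye(num_classes, dtype=np.float32)[mask]


# Random stand-in for a real segmentation mask (assumption: no real data here).
rng = np.random.default_rng(0)
mask = rng.integers(0, NUM_CLASSES, size=(96, 144))

encoded = one_hot_mask(mask)
print(encoded.shape)
```

Taking `encoded.argmax(axis=-1)` recovers the original integer mask, so nothing is lost by the conversion.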

But I am still unsure whether this is the correct way, or whether there is a better one.

What should the following be?

  • input and output data format
  • activation of the last layer
  • loss function
user52610

1 Answer

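I found the solution: use categorical cross-entropy with a softmax activation on the last layer, and keep the same data format as described in the question. Each pixel belongs to exactly one of the 13 classes, so a softmax across the 13 channels fits better than 13 independent sigmoids.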
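For anyone who wants to see what this combination computes per pixel, here is a rough NumPy sketch. The function names, the `eps` clipping value, and the random logits are assumptions for illustration; the Keras implementation differs in its details.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Softmax over the class axis, so each pixel's 13 scores sum to 1."""
    shifted = logits - logits.max(axis=axis, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over pixels of -sum_c y_true[c] * log(y_pred[c]).

    eps is an assumed clipping constant to avoid log(0).
    """
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))


# Random stand-ins for a target mask and for raw network outputs (assumptions).
rng = np.random.default_rng(0)
labels = rng.integers(0, 13, size=(96, 144))
y_true = np.eye(13, dtype=np.float32)[labels]   # one-hot target, (96, 144, 13)
logits = rng.normal(size=(96, 144, 13))         # pretend last-layer outputs

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.shape, loss)
```

In Keras this corresponds to a last layer with 13 channels and activation="softmax", compiled with loss="categorical_crossentropy". If you would rather keep the targets as integer masks of shape (96, 144), loss="sparse_categorical_crossentropy" accepts them without one-hot encoding.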

user52610