I would be grateful for an answer to my question. I am worried that I am doing something wrong, because my network always outputs a black image with no segmentation. I am doing semantic segmentation in Caffe. The output of the score layer has shape <1 5 256 256> (batch_size, no_classes, image_height, image_width). It is sent to a SoftmaxWithLoss layer, and the other input of the loss layer is the ground-truth image with 5 class labels, of shape <1 1 256 256>.

My question is: the dimensions of these two inputs to the loss layer do not match. Should I create 5 label images, one for each of the 5 classes, and send them from the label layer into the loss layer with a batch_size of 5?

How can I prepare label data for semantic segmentation?

Regards

S.EB

1 Answer

Your dimensions are okay. You are outputting a 5-vector per pixel, which the softmax turns into the probability of each class. The ground truth is a single integer label per pixel (0 to 4 for 5 classes), and the loss encourages the probability of the correct label to be the largest for that pixel.
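To make the shapes concrete, here is a minimal NumPy sketch of what a per-pixel softmax loss computes. It is not Caffe's code, and the function name `softmax_loss_per_pixel` and the random inputs are made up for illustration. It assumes the labels are integers in 0..4 and that there is no ignore label.

```python
import numpy as np

def softmax_loss_per_pixel(scores, labels):
    """Mean softmax cross-entropy over all pixels (illustrative helper).

    scores: float array of shape (N, K, H, W), raw class scores
    labels: integer array of shape (N, 1, H, W), values in 0..K-1
    """
    # Subtract the per-pixel max over the class axis for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    # Log-softmax over the class axis (axis 1).
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Select the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_probs, labels, axis=1)  # (N, 1, H, W)
    return -picked.mean()

# Random stand-ins with the shapes from the question.
rng = np.random.default_rng(0)
scores = rng.normal(size=(1, 5, 256, 256))          # score layer output
labels = rng.integers(0, 5, size=(1, 1, 256, 256))  # single label image

loss = softmax_loss_per_pixel(scores, labels)

# The predicted segmentation is the per-pixel argmax over the class axis.
prediction = scores.argmax(axis=1)  # shape (1, 256, 256)
```

The single label channel acts as an index into the 5 class channels, so one integer label image is enough.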

Shai
  • Thanks for your comment. So this means there is no need to create 5 different label images, each a two-class image separating one specific class from the background? Sorry, I am a bit confused after looking at this [link](https://github.com/BVLC/caffe/issues/1341) – S.EB Mar 28 '17 at 05:33