Fully convolutional neural network for semantic segmentation

Question

I have a perhaps naive question, and apologies if this is not the appropriate channel for this kind of question. I have successfully implemented a fully convolutional neural network (FCNN) for semantic segmentation, but it does not use any deconvolution or unpooling layers.

I simply resize the ground-truth image to the size of my final FCNN layer and then compute the loss. This way I obtain a smaller image as output, but one that is correctly segmented.
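For concreteness, the resizing step described above could look roughly like the sketch below. It is library-free on purpose, and the function name `downsample_labels_nearest` and the toy 4x4 mask are made up for illustration, not taken from my actual code. It assumes the ground truth is a 2D grid of integer class IDs and uses nearest-neighbour sampling, so that the resized mask only contains class IDs that exist in the original (averaging interpolation would blend IDs into meaningless values).

```python
def downsample_labels_nearest(mask, out_h, out_w):
    """Resize a 2D integer label mask with nearest-neighbour sampling.

    Hypothetical helper for illustration: each output pixel copies the
    class ID of one source pixel, so no new (blended) IDs are created.
    """
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Toy ground truth with three classes (0, 1, 2); values are made up.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
    [2, 2, 2, 2],
]

# Shrink to the (assumed) 2x2 resolution of the final network layer.
small = downsample_labels_nearest(mask, 2, 2)
print(small)
```

The loss is then computed between the network output and `small`, both at the reduced resolution.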

Is the process of deconvolution or unpooling needed at all?

I mean, resizing images in Python is quite easy, so why should one use complicated techniques such as deconvolution or unpooling to do the same thing? Surely I am missing something.

What's the advantage of enlarging images using unpooling and deconvolution?

score 0 · Answer 1 · answered Apr 25 '18 at 15:14


The output of your network after the convolution steps is smaller than your original image. You probably don't want that: you want a semantic segmentation of the image you give as input, at its original resolution.

If you simply resize the output back to the original size, the new pixels are interpolated and therefore lack precision. Deconvolution layers let the network learn this resizing (their weights are learned during training, through backpropagation), and can therefore increase your segmentation precision.
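To illustrate the difference, here is a minimal 1D sketch in plain Python (the function name `transposed_conv1d` and the numbers are only for illustration; real frameworks provide their own transposed-convolution layers). It shows that a stride-2 deconvolution with the fixed kernel `[0.5, 1.0, 0.5]` reproduces ordinary linear interpolation in the interior of the signal. In a trained network those kernel weights would not be fixed but updated by backpropagation, so the upsampling can adapt to the data instead of being hard-coded.

```python
def transposed_conv1d(x, kernel, stride):
    """1D transposed convolution (a.k.a. deconvolution), no padding.

    Each input value "stamps" a scaled copy of the kernel into the
    output, with stamps spaced `stride` positions apart.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


coarse = [1.0, 3.0]               # low-resolution signal
linear_kernel = [0.5, 1.0, 0.5]   # fixed weights equivalent to linear interpolation

upsampled = transposed_conv1d(coarse, linear_kernel, stride=2)
print(upsampled)  # [0.5, 1.0, 2.0, 3.0, 1.5]
```

The interior values `1.0, 2.0, 3.0` are exactly what linear interpolation between 1 and 3 would give; the first and last entries are border effects of the unpadded operation. A learned kernel can deviate from `[0.5, 1.0, 0.5]` wherever that reduces the segmentation loss, which is where the potential gain in precision comes from.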

answered Apr 25 '18 at 15:14

MeanStreet


I see. I am currently following this technique. I don't really feel the need to have more precision than the usual interpolation in my case. Thank you very much for the comment. – karakorum Apr 27 '18 at 06:07
