I had a binary segmentation task: I had to predict yes or no for each pixel of an image.

I therefore used a binary cross entropy loss (PyTorch's `BCEWithLogitsLoss`, which combines a sigmoid layer and binary cross entropy in a single module) to train the network.

To compute the metrics, since I needed a 0-or-1 output for each pixel, I applied the sigmoid function and then treated everything below 0.5 as 0 and everything above 0.5 as 1.
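In code, the pipeline described above looks roughly like this (a minimal sketch rather than the actual code behind the question; the model and tensor shapes are placeholder assumptions):

```python
import torch
import torch.nn as nn

# Placeholder stand-in for a real segmentation network: one logit per pixel.
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)

# BCEWithLogitsLoss applies the sigmoid and the binary cross entropy in one step.
criterion = nn.BCEWithLogitsLoss()

images = torch.randn(4, 3, 64, 64)                     # dummy batch
targets = torch.randint(0, 2, (4, 1, 64, 64)).float()  # 0/1 mask per pixel

# Training step: the loss is computed on raw logits.
loss = criterion(model(images), targets)
loss.backward()

# Evaluation: sigmoid, then a fixed 0.5 threshold, as described above.
with torch.no_grad():
    probs = torch.sigmoid(model(images))
    preds = (probs >= 0.5).float()
```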

However, I think this approach is not correct and that I should have used something like a softmax instead. Could you explain which approach I should have followed, and why?

emanuele_f
  • Your method sounds good; just select your threshold to be the one that gives you the best score on some validation data (see the sketch after these comments). 0.5 might be a very bad threshold. – jhso Mar 03 '22 at 23:00
  • I did not think about that! Could you explain why I should treat the threshold as a hyper-parameter? Intuitively 0.5 seems the most rational choice to me, since it is as far from 1 as it is from 0, but I could not justify it mathematically. – emanuele_f Mar 03 '22 at 23:33
  • 0.5 is usually a good option when your classes are balanced. However, if you plot the distribution of your predicted probabilities for class 0 and class 1, you will often see that the best cut-off point (the high end of class 0 and the low end of class 1) is not at 0.5. When you have class imbalance, this point will shift towards the dominant class. – jhso Mar 03 '22 at 23:42
  • I used a weighted binary cross entropy, though; would it be the same in that case? Sorry for not having described my approach correctly. – emanuele_f Mar 03 '22 at 23:49
  • That should still be fine, just give it a try. If you want a general report on how your model is doing, you should use the area under the ROC curve (look up sklearn's implementation); this gives you a threshold-free measure of accuracy. – jhso Mar 04 '22 at 00:13
  • For binary classification, sigmoid + CE is equivalent to two outputs + softmax + CE. Do the math, it's a good exercise. – Shai Mar 06 '22 at 20:02
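To make the threshold-selection and AUC suggestions from the comments above concrete, here is a minimal sketch of a validation-set threshold sweep plus the threshold-free ROC AUC report; the arrays are placeholders and the choice of F1 as the selection score is an assumption, not part of the original exchange:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

# probs: sigmoid outputs flattened to 1D; labels: matching 0/1 ground truth.
# Both are assumed to come from a held-out validation set (placeholders here).
rng = np.random.default_rng(0)
probs = rng.random(10_000)
labels = (rng.random(10_000) > 0.7).astype(int)   # deliberately imbalanced

# Sweep candidate thresholds and keep the one with the best validation F1.
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(labels, (probs >= t).astype(int)) for t in thresholds]
best_t = thresholds[int(np.argmax(scores))]
print(f"best threshold: {best_t:.2f}, F1: {max(scores):.3f}")

# Threshold-free summary of model quality, as suggested in the comments.
print(f"ROC AUC: {roc_auc_score(labels, probs):.3f}")
```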
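And a quick numeric check of the sigmoid-vs-softmax equivalence claimed in the last comment, for the special case where the class-0 logit is fixed at 0 (the general two-logit case reduces to a sigmoid of the logit difference):

```python
import torch
import torch.nn.functional as F

z = torch.randn(1000)                      # one logit per sample
y = torch.randint(0, 2, (1000,)).float()   # binary targets

# Single-output formulation: sigmoid + binary cross entropy.
bce = F.binary_cross_entropy_with_logits(z, y)

# Two-output formulation: logits [0, z], softmax + cross entropy.
# softmax([0, z])[1] == sigmoid(z), so both losses should match.
two_logits = torch.stack([torch.zeros_like(z), z], dim=1)
ce = F.cross_entropy(two_logits, y.long())

print(torch.allclose(bce, ce, atol=1e-6))  # True
```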

0 Answers