I am working with a Faster R-CNN-style detector in which automated focal loss, from https://arxiv.org/pdf/1904.09048.pdf, was recently implemented.

Section 3.4 (Regression) of the linked paper states:

We assume that the labels are distributed around the actual correct ground truth by a Gaussian distribution with a variance of σ^2.

and

However, to correctly compute the cumulative distribution function the variance σ^2 of the task needs to be estimated. [...] training the variable σ^2 like a weight of the network.

I do not have data for the task variance σ^2.

I do not fully understand how it can be learned without having data for it.

Should I simply make the variable trainable and assume that the optimizer knows what to do?
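To make the question concrete, here is a minimal sketch of what I understand "training σ^2 like a weight" to mean. It is an assumption on my part, not the paper's method: it uses a plain Gaussian negative log-likelihood instead of the paper's loss, synthetic residuals instead of real box-regression errors, and hand-written gradient descent. The names `gaussian_nll_grad` and `fit_log_var` are made up for illustration. In this toy setting the only "data" for σ^2 is the regression residuals themselves, and the minimum of the loss with respect to σ^2 sits at the mean squared residual.

```python
import math
import random


def gaussian_nll_grad(residuals, log_var):
    """Gradient of the mean Gaussian NLL with respect to log(sigma^2).

    Per sample, NLL = 0.5 * (log_var + r^2 / exp(log_var)) + const.
    """
    var = math.exp(log_var)
    mean_sq = sum(r * r for r in residuals) / len(residuals)
    return 0.5 * (1.0 - mean_sq / var)


def fit_log_var(residuals, lr=0.1, steps=2000, init_log_var=0.0):
    """Treat log(sigma^2) as a trainable scalar and run gradient descent."""
    log_var = init_log_var
    for _ in range(steps):
        log_var -= lr * gaussian_nll_grad(residuals, log_var)
    return log_var


# Synthetic residuals (prediction minus label); true_sigma is a toy value.
random.seed(0)
true_sigma = 0.5
residuals = [random.gauss(0.0, true_sigma) for _ in range(5000)]

learned_var = math.exp(fit_log_var(residuals))
empirical_var = sum(r * r for r in residuals) / len(residuals)
print(learned_var, empirical_var)
```

Parameterizing by log σ^2 rather than σ^2 keeps the variance positive without a constraint. In a real detector the residuals would change as the network trains, so this only illustrates the mechanism and says nothing about how the paper's loss behaves.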

Alex