I'm trying to understand how to calculate the value of the circled terms. What are the inputs/outputs I have to compare?
Let's take the first term as an example. If I understand it correctly, it goes something like this:
Let's say my predicted values for the first cell, first bounding box (YOLOv1 predicts 2 bounding boxes per cell) are
[pr, x, y, W, H]
[.7, 0.5, 0.3, 0.1, 0.1]
and my true values are
[pr, x, y, W, H]
[1, 0.6, 0.4, 0.2, 0.2]
That would mean the formula goes something like this
5 * (1 or 0) * ((0.6 - 0.5)^2 + (0.4 - 0.3)^2)
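To make the arithmetic concrete, here is a minimal sketch of how I read that first term for this one box. The function name `center_term` and the `indicator` argument are my own placeholders, not something from the paper. The indicator is passed in by hand because how to choose it is exactly what I'm asking about.

```python
# Values from the example above, in the order [pr, x, y, w, h].
pred = [0.7, 0.5, 0.3, 0.1, 0.1]
true = [1.0, 0.6, 0.4, 0.2, 0.2]

LAMBDA_COORD = 5.0  # the factor 5 in front of the term


def center_term(pred_box, true_box, indicator):
    """(x, y) part of the loss for one box.

    `indicator` is the 1-or-0 factor; it is supplied by the caller
    because this sketch does not know how to derive it.
    """
    dx = true_box[1] - pred_box[1]
    dy = true_box[2] - pred_box[2]
    return LAMBDA_COORD * indicator * (dx ** 2 + dy ** 2)


loss_if_one = center_term(pred, true, 1)   # about 0.1, up to float rounding
loss_if_zero = center_term(pred, true, 0)  # the whole term vanishes
print(loss_if_one, loss_if_zero)
```

So the term is either roughly 0.1 or exactly 0 for this box, depending entirely on the indicator.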
Can someone provide a step-by-step example of what determines whether this factor is 1 or 0?
Are we looking at the label from the training set image? Are we looking at the predicted objectness score? IoU?
According to the YOLO paper:
$\mathbb{1}_{ij}^{\text{obj}}$: denotes that the jth bounding box predictor in cell i is “responsible” for that prediction
$\mathbb{1}_{i}^{\text{obj}}$: denotes if object appears in cell i
But this quote doesn't exactly help me answer my question... Any help would be appreciated.
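For reference, here is a sketch of my current guess. As far as I can tell, the paper assigns responsibility to the predictor whose box has the highest current IoU with the ground truth, and only in a cell that the label says contains an object. The helper names, the second predicted box, and the assumption that all boxes are given as (x, y, w, h) centers in the same coordinate frame are mine, so please correct me if this reading is wrong.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_center, y_center, w, h), same frame assumed."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def responsible_indicators(pred_boxes, true_box, object_in_cell):
    """Guess at the per-box indicator for one cell: a list of 0/1, one per box.

    `object_in_cell` comes from the training label (does the cell contain
    an object center?), not from the predicted objectness score.
    """
    if not object_in_cell:
        return [0] * len(pred_boxes)
    ious = [iou(p, true_box) for p in pred_boxes]
    best = max(range(len(pred_boxes)), key=lambda j: ious[j])
    return [1 if j == best else 0 for j in range(len(pred_boxes))]


true_box = (0.6, 0.4, 0.2, 0.2)
pred_boxes = [
    (0.5, 0.3, 0.1, 0.1),     # first box from my example
    (0.6, 0.45, 0.25, 0.2),   # second box: made up for illustration
]
indicators = responsible_indicators(pred_boxes, true_box, object_in_cell=True)
print(indicators)
```

Under this reading, the made-up second box overlaps the ground truth more, so it would get the 1 and my first box would get the 0, making its coordinate term vanish.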