I am reading the original YOLO paper: https://arxiv.org/pdf/1506.02640.pdf.

At the beginning, the paper says:

If the center of an object falls into a grid cell, that grid cell is responsible for detecting that object.

And about the loss function:

Note that the loss function only penalizes classification error if an object is present in that grid cell (hence the conditional class probability discussed earlier).

So my understanding is that an object is present in a cell only if the center of the object falls into that cell. Even if part of an object (but not its center) lies in a cell, we still treat that cell as not containing an object (1_i^obj = 0), and its target confidence score should be 0.
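The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as read from the paper, not the authors' code: the function names `responsible_cell` and `object_mask` are hypothetical, boxes are assumed to be given as (x_center, y_center, width, height) in pixels, and S = 7 is the grid size used in the paper.

```python
def responsible_cell(x_center, y_center, img_w, img_h, S=7):
    """Return (row, col) of the grid cell containing the box center."""
    # Scale the center to grid units and truncate; clamp so a center lying
    # exactly on the right or bottom image edge still maps to a valid cell.
    col = min(int(x_center / img_w * S), S - 1)
    row = min(int(y_center / img_h * S), S - 1)
    return row, col


def object_mask(boxes, img_w, img_h, S=7):
    """Build an S x S mask that is 1 only in cells holding an object center."""
    mask = [[0] * S for _ in range(S)]
    for x_center, y_center, _w, _h in boxes:
        row, col = responsible_cell(x_center, y_center, img_w, img_h, S)
        mask[row][col] = 1
    return mask


# A 200 x 200 box centered at (100, 300) in a 448 x 448 image overlaps
# several cells, but only the cell containing its center is marked.
mask = object_mask([(100, 300, 200, 200)], img_w=448, img_h=448)
```

Under this reading, every cell the box merely overlaps keeps a mask value of 0, which is the situation the question asks about.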

Am I correct?

Ruoyu Guo

1 Answer

I will try to answer the question (please correct me if I have made a mistake). First of all, why is the centre cell responsible for the bounding box information?

1) YOLO's bounding box annotation differs from other approaches: it uses (x_centre, y_centre, width, height) instead of (x_min, y_min, x_max, y_max) (why? see below).

2) One term of the loss penalizes the difference between the predicted and ground-truth centre, width, and height, and this term makes the centre cell more likely than the other cells to produce the prediction with the highest IOU. Since only the predicted bounding box with the highest IOU is used during training, the centre cell almost always has the highest confidence.
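As a small illustration of the two box formats mentioned above, here is a sketch of the conversion from corner format to centre format. The function name `corners_to_centre` is made up for this example, and coordinates are assumed to be in pixels.

```python
def corners_to_centre(x_min, y_min, x_max, y_max):
    """Convert (x_min, y_min, x_max, y_max) to (x_centre, y_centre, width, height)."""
    width = x_max - x_min
    height = y_max - y_min
    x_centre = x_min + width / 2.0
    y_centre = y_min + height / 2.0
    return x_centre, y_centre, width, height


# A box spanning x in [50, 150] and y in [100, 300].
box = corners_to_centre(50, 100, 150, 300)
```

With the centre format, the centre point is available directly, so finding the cell that contains it needs no extra step.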

Finally, back to your question: no. The confidence scores of the non-centre cells are very likely to be lower than that of the centre cell, but they are not 0 if the cell is covered by an object. Another way to see this is that, in the testing phase, a large number of bounding boxes are generated and NMS is used to pick the best ones.
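For reference, a minimal sketch of greedy NMS (non-maximum suppression) over corner-format boxes is given below. It is a generic textbook version, not the exact procedure from the YOLO paper; the names `iou` and `nms` and the threshold of 0.5 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first heavily and has a lower score, so it is suppressed, while the third box does not overlap either and is kept.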

Sun_Rider