Here's a screenshot from a YouTube video implementing the loss function from the original YOLOv1 research paper.
What I don't understand is why torch.flatten() is needed when passing the inputs to self.mse(), which is in fact nn.MSELoss(). The only reason the video gives is that nn.MSELoss() expects its input in the shape (a, b), and I don't understand how or why that would be the case.
Video link just in case. [For reference, N is the batch size, S is the grid size (split size)]
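
To make the question concrete, here is a minimal sketch of the comparison I have in mind. The tensor names, the shape (N, S, S, 4), the values N = 2 and S = 7, and reduction="sum" are my own assumptions for illustration, not taken from the video. As far as I can tell from the PyTorch docs, nn.MSELoss() accepts inputs of any shape as long as the target has the same shape, so I would expect both calls to give the same loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical shapes: N = batch size, S = grid size, 4 box coordinates per cell.
N, S = 2, 7
box_predictions = torch.randn(N, S, S, 4)
box_targets = torch.randn(N, S, S, 4)

# reduction="sum" is an assumption here; the default is "mean".
mse = nn.MSELoss(reduction="sum")

# Without flattening: both inputs keep the shape (N, S, S, 4).
loss_unflattened = mse(box_predictions, box_targets)

# With flattening: end_dim=-2 merges every dim except the last -> (N*S*S, 4).
flat_predictions = torch.flatten(box_predictions, end_dim=-2)
flat_targets = torch.flatten(box_targets, end_dim=-2)
loss_flattened = mse(flat_predictions, flat_targets)

print(flat_predictions.shape)
print(torch.allclose(loss_unflattened, loss_flattened))
```

If the two losses really do match, then the flatten call looks optional rather than required, which is exactly the part I'd like confirmed or corrected.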