I have read two top-ranking Kaggle solutions for multi-label image classification, and both of them used random cropping. To me, this seems like a bad move: a crop can cut out the region that a label refers to, leaving a mismatch between the labels and the cropped image. Here are the two competitions:
1. human-protein-atlas-image-classification
2. iMet Collection 2019 - FGVC6
If the reason for cropping is an input-size constraint imposed by the model architecture, isn't it better to resize the image instead of cropping it?
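
To make the concern concrete, here is a minimal sketch on a toy image. The helper names `random_crop` and `resize_nearest`, the 8x8 array, and the 4x4 target size are all made up for illustration; none of this is taken from either solution.

```python
import numpy as np

def random_crop(image, size, rng):
    """Return a random size x size window of a 2-D image."""
    h, w = image.shape
    top = rng.integers(0, h - size + 1)   # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

def resize_nearest(image, size):
    """Downscale a 2-D image to size x size by nearest-neighbour sampling."""
    h, w = image.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[np.ix_(rows, cols)]

# Toy 8x8 image: the only labelled "object" is a 4x4 patch in the top-left corner.
image = np.zeros((8, 8), dtype=np.uint8)
image[:4, :4] = 1

rng = np.random.default_rng(0)
crops = [random_crop(image, 4, rng) for _ in range(1000)]

# Fraction of crops that contain no pixel of the object at all.
missed = sum(int(c.sum() == 0) for c in crops) / len(crops)

# Resizing keeps the whole field of view, at lower resolution.
resized = resize_nearest(image, 4)

print(f"crops missing the object: {missed:.2f}")
print(f"object pixels after resize: {int(resized.sum())}")
```

In this toy setup a crop misses the object entirely whenever its top or left offset is 4, which is 9/25 of the time in expectation, so the image-level label would be wrong for about a third of the crops. The resized image always keeps the object, but at a quarter of its original pixel count.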