
I am using Detectron2 (Mask R-CNN model) and passed:

_C.INPUT.MIN_SIZE_TEST = (800, 832, 864, 896)
_C.INPUT.MAX_SIZE_TEST = 1333

How is it possible to have different input image sizes? How are they fed into the model, and shouldn't the model have a consistent input size?

I tried to check the documentation but didn't find a clear answer.


1 Answer


With a given kernel size and stride, convolutional layers can process any input size; the spatial dimensions of the output feature map simply scale with the input.
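A quick sketch of this relationship, using the standard conv output-size formula (the kernel/stride/padding values below are illustrative, not Detectron2's actual backbone configuration):

```python
# Output spatial size of a conv layer: floor((H + 2p - k) / s) + 1.
# No fixed input size is assumed anywhere.
def conv_out(size, kernel=3, stride=2, padding=1):
    return (size + 2 * padding - kernel) // stride + 1

# Two different test-time input heights from the question's config
# produce feature maps of different (but valid) sizes:
print(conv_out(800))  # -> 400
print(conv_out(896))  # -> 448
```

The convolution itself never cares about the overall input size, only about the local neighborhood under the kernel, which is why the backbone can accept any resolution.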

The subsequent fully connected (FC) layers, however, do require a fixed-size input vector. This is where Mask R-CNN uses RoIAlign, which converts each region proposal (of arbitrary size) to a fixed spatial size for subsequent processing by the network. It serves the same purpose as RoI pooling in Fast R-CNN.
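To illustrate the idea, here is a simplified RoI pooling sketch in NumPy (real RoIAlign additionally uses bilinear interpolation at sub-pixel sampling points; the function and box values here are made up for illustration). The point is that regions of different sizes all come out at the same fixed shape:

```python
import numpy as np

def roi_pool(feature_map, box, pool_h=2, pool_w=2):
    """Max-pool an arbitrary-size region into a fixed pool_h x pool_w grid."""
    x1, y1, x2, y2 = box
    region = feature_map[y1:y2, x1:x2]
    out = np.empty((pool_h, pool_w))
    # Split the region into a pool_h x pool_w grid of bins and take
    # the max of each bin, so the output shape never depends on the box size.
    for i, row_bin in enumerate(np.array_split(region, pool_h, axis=0)):
        for j, cell in enumerate(np.array_split(row_bin, pool_w, axis=1)):
            out[i, j] = cell.max()
    return out

fm = np.arange(100, dtype=float).reshape(10, 10)
# Two proposals of very different sizes -> identical fixed output shape:
print(roi_pool(fm, (0, 0, 4, 6)).shape)  # (2, 2)
print(roi_pool(fm, (2, 1, 9, 9)).shape)  # (2, 2)
```

Because every proposal is reduced to the same fixed grid, the FC heads downstream always see a constant-length vector regardless of the image or proposal size.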

Hope this explains why the input size does not need to be fixed.

Niranjan Ramesh