
I am using Detectron2 (Mask R-CNN model) and passed:

_C.INPUT.MIN_SIZE_TEST = (800, 832, 864, 896)
_C.INPUT.MAX_SIZE_TEST = 1333

How is it possible to have different input image sizes? How are they fed into the model, and shouldn't the model have a consistent input size?

I tried to check the documentation but didn't find a clear answer.


1 Answer


With a given kernel size and stride, convolutional layers can process any input size; the spatial dimensions of the output feature map simply scale with the input.
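A quick sketch of this relationship, using the standard conv output-size formula (the kernel/stride/padding values below are illustrative, not Detectron2's actual backbone configuration):

```python
# Output spatial size of a conv layer: floor((H + 2p - k) / s) + 1.
# No fixed input size is assumed anywhere.
def conv_out(size, kernel=3, stride=2, padding=1):
    return (size + 2 * padding - kernel) // stride + 1

# Two different test-time input heights from the question's config
# produce feature maps of different (but valid) sizes:
print(conv_out(800))  # -> 400
print(conv_out(896))  # -> 448
```

The convolution itself never cares about the overall input size, only about the local neighborhood under the kernel, which is why the backbone can accept any resolution.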

The subsequent fully connected (FC) layers, however, do require a fixed-size input vector. This is where Mask R-CNN uses RoIAlign, which converts each region proposal (of arbitrary size) to a fixed spatial size for subsequent processing by the network. It serves the same purpose as RoI pooling in Fast R-CNN.
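To illustrate the idea, here is a simplified RoI pooling sketch in NumPy (real RoIAlign additionally uses bilinear interpolation at sub-pixel sampling points; the function and box values here are made up for illustration). The point is that regions of different sizes all come out at the same fixed shape:

```python
import numpy as np

def roi_pool(feature_map, box, pool_h=2, pool_w=2):
    """Max-pool an arbitrary-size region into a fixed pool_h x pool_w grid."""
    x1, y1, x2, y2 = box
    region = feature_map[y1:y2, x1:x2]
    out = np.empty((pool_h, pool_w))
    # Split the region into a pool_h x pool_w grid of bins and take
    # the max of each bin, so the output shape never depends on the box size.
    for i, row_bin in enumerate(np.array_split(region, pool_h, axis=0)):
        for j, cell in enumerate(np.array_split(row_bin, pool_w, axis=1)):
            out[i, j] = cell.max()
    return out

fm = np.arange(100, dtype=float).reshape(10, 10)
# Two proposals of very different sizes -> identical fixed output shape:
print(roi_pool(fm, (0, 0, 4, 6)).shape)  # (2, 2)
print(roi_pool(fm, (2, 1, 9, 9)).shape)  # (2, 2)
```

Because every proposal is reduced to the same fixed grid, the FC heads downstream always see a constant-length vector regardless of the image or proposal size.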

Hope this explains why the input size does not need to be fixed.

Niranjan Ramesh