I got the pretrained FASTERRCNN_RESNET50_FPN model from PyTorch (torchvision).
Now I want to compute the model's complexity (number of parameters and FLOPs), as reported by torchvision.
How can I do this?
Normally, with a classification model (e.g. resnet50), we can use tools such as thop or ptflops. But my main concern is: what is the correct input image size (width and height; channels = 3 for sure)?
From my reading, Faster R-CNN accepts input images of variable size, but I have not found the step where the image is resized during the forward pass.
Personally, I think the image is first passed to the backbone (which is resnet50), so I chose an input image size of (224, 224), the same as ImageNet's. But when I try this with ptflops, the reported FLOPs are very unstable.
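One thing I suspect but have not confirmed in the source: torchvision's detection models seem to resize every input internally before the backbone (I believe the defaults are min_size=800 and max_size=1333, with padding to a multiple of 32), which would mean the (224, 224) I pass in is not what the backbone actually sees. Below is a minimal pure-Python sketch of that rule. `resized_shape` is a hypothetical helper of mine, the default values are assumptions, and it uses exact fractions where the library would use floats, so results may differ by a pixel:

```python
import math
from fractions import Fraction


def resized_shape(height, width, min_size=800, max_size=1333, size_divisible=32):
    """Estimate the image shape after the assumed internal resize and padding.

    Returns ((resized_h, resized_w), (padded_h, padded_w)).
    All default values are assumptions about torchvision's detection transform.
    """
    # Scale so the shorter side reaches min_size, unless that would push
    # the longer side past max_size; in that case max_size limits the scale.
    scale = min(
        Fraction(min_size, min(height, width)),
        Fraction(max_size, max(height, width)),
    )
    new_h = math.floor(height * scale)
    new_w = math.floor(width * scale)

    # Pad each side up to the next multiple of size_divisible.
    pad_h = math.ceil(new_h / size_divisible) * size_divisible
    pad_w = math.ceil(new_w / size_divisible) * size_divisible
    return (new_h, new_w), (pad_h, pad_w)


if __name__ == "__main__":
    for shape in [(224, 224), (480, 640), (500, 2000)]:
        print(shape, "->", resized_shape(*shape))
```

If this is right, a (224, 224) input would reach the backbone at (800, 800) rather than at ImageNet resolution, and the FLOP count would depend on the aspect ratio of whatever image is passed in.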
Any recommendation is appreciated! Thanks in advance
I tried with ptflops.
I expect a reasonable answer on the correct input image size.