
I am trying to convert my PyTorch object detection model (Faster R-CNN) to ONNX. I have two setups. The first one works correctly, but I want to use the second one for deployment reasons. The difference lies in the example input that I pass to torch.onnx.export().

In the first setup I use a real image as input for the ONNX export. However, an official tutorial says that I can use a dummy input, as long as it has the same size and type as the input the model expects. So I created a tensor with the same shape but random values. The export works in both setups, but the second setup does not deliver the desired results during inference with the ONNX Runtime. The code and exemplary output can be found below.

Setup 1

import torch
import torchvision
from torchvision import transforms
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
...
checkpoint = torch.load(model_state_dict_path)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Load a real image and convert it to a (1, 3, H, W) float tensor.
to_tensor = transforms.ToTensor()
img_rgb = Image.open(image_path_model).convert('RGB')
img_rgb = to_tensor(img_rgb)
img_rgb.unsqueeze_(0)

torch.onnx.export(model, img_rgb, "detection.onnx", opset_version=11)
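
As a side note, the exported file can additionally be validated with the ONNX checker, as in the tutorial (a minimal sketch; this only verifies the graph structure, not the numerical behaviour):

import onnx

# Sketch: structural validation of the exported graph. This checks the
# graph definition only, not whether the model produces sensible outputs.
onnx_model = onnx.load("detection.onnx")
onnx.checker.check_model(onnx_model)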

The export completes without errors. Afterwards I run the model with the ONNX Runtime and get the following output:

[array([[704.0696  , 535.19556 , 944.8986  , 786.1619  ],
         ...], dtype=float32),
array([2, 2, 2, 2, 2, 1, 1], dtype=int64),
array([0.9994363 , 0.9984769 , 0.99816966, ...], dtype=float32)]

The output is what I expect: bounding boxes, class labels, and scores.
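
For completeness, the inference with the ONNX Runtime looks roughly like this in both setups (a minimal sketch; the preprocessing is assumed to mirror the export code, and session options are omitted):

import numpy as np
import onnxruntime as ort
from PIL import Image

# Sketch of the inference call: load the exported model and feed the
# same (1, 3, H, W) float32 tensor layout used during export.
session = ort.InferenceSession("detection.onnx")

img = Image.open(image_path_model).convert('RGB')
img = np.asarray(img, dtype=np.float32).transpose(2, 0, 1) / 255.0  # HWC -> CHW, scaled to [0, 1]
img = np.expand_dims(img, 0)  # add batch dimension

input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: img})
print(outputs)  # [boxes, labels, scores]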

Setup 2

import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
...
checkpoint = torch.load(model_state_dict_path)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Dummy input with the shape the model expects, but random values.
img_rgb = torch.randn(1, 3, 1024, 1024)

torch.onnx.export(model, img_rgb, "detection.onnx", opset_version=11)

As in Setup 1, I get no error and the export works. Afterwards I run the model with the ONNX Runtime, using the same image as in Setup 1, and get the following output:

[array([], shape=(0, 4), dtype=float32),
array([], dtype=int64),
array([], dtype=float32)]

All three output arrays are empty.

What is wrong with the second setup? I am new to ONNX. As far as I understand, the export runs the model. Do I have to provide an input on which the model actually recognizes objects, and is that why the dummy input with random values does not work? Or is the statement "The values in this can be random as long as it is the right type and size." only valid for the provided tutorial?

Tom

1 Answer


In the second setup you export with a random tensor, so no bounding box with a high enough detection score was selected. Check that the input is an image with detectable objects.

I assume the statement about random inputs is correct in most cases (classification, segmentation, etc.), but detection models use NonMaxSuppression and suppress detections with low scores.
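
You can check this in plain PyTorch (a sketch; it assumes torchvision's default score threshold for the detector):

import torch
import torchvision

# Sketch: run the original PyTorch model on the same kind of random
# noise used for the export. On noise, detection scores typically stay
# below the score threshold, so NMS/filtering leaves no boxes.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

with torch.no_grad():
    output = model([torch.randn(3, 1024, 1024)])[0]

print(output['boxes'].shape, output['scores'])  # usually empty or near-zero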

vuvko
  • But what does ONNX do during the export? I just use the random input in torch.onnx.export(). After the export I run my ONNX model with the ONNX Runtime and a real image, and I get the empty output. So what does the export do with the model internally? – Tom Jul 13 '20 at 11:45
  • Sorry, I hadn't noticed that you ran with the same image. I'm not sure about your case, but if you run your first setup on a different image than the one you used for export, does the output change? – vuvko Jul 14 '20 at 20:38