I have a model trained with YOLOv5s and it works fine.
I can get the expected result when I run inference with PyTorch:
This is an output image:
The thing is, I need it in OpenVINO, and regardless of whether I run inference with the .onnx model or with the .bin/.xml pair (OpenVINO IR), I don't get the expected result.
What I get is a vector with shape (1, 25200, 6) (see the decoding sketch after this list). I know that:
- 25200 is equal to 1x3x80x80 + 1x3x40x40 + 1x3x20x20;
- 6 = 4 box coordinates (x, y, w, h) + 1 objectness score + 1 class score;
- batch_size = 1
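For reference, here is a minimal sketch of how that raw (1, 25200, 6) tensor is normally decoded. I'm assuming the exported Detect head already applies the sigmoid/grid decoding (the YOLOv5 export default), so each row is (cx, cy, w, h, objectness, class_score) in pixels of the 640x640 input; decode_yolov5 is just an illustrative helper name, not something produced by export.py:

import cv2
import numpy as np

def decode_yolov5(pred, conf_thres=0.25, iou_thres=0.45):
    # pred: (25200, 6) array of (cx, cy, w, h, objectness, class_score) rows
    scores = pred[:, 4] * pred[:, 5]                  # final confidence = objectness * class score
    keep = scores > conf_thres
    pred, scores = pred[keep], scores[keep]
    # convert centre-based boxes to top-left (x, y, w, h), the layout cv2.dnn.NMSBoxes expects
    boxes = np.stack([pred[:, 0] - pred[:, 2] / 2,
                      pred[:, 1] - pred[:, 3] / 2,
                      pred[:, 2],
                      pred[:, 3]], axis=1)
    idx = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), conf_thres, iou_thres)
    idx = np.array(idx, dtype=int).reshape(-1)
    return boxes[idx], scores[idx]

The boxes come out in the resized 640x640 input space, so they still have to be scaled back to the original image (and corrected for letterbox padding if that is applied).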
To export it, I used:
!python export.py --data models/custom_yolov5s.yaml --weights /content/bucket_11_03_2022.pt --batch-size 1 --device cpu --include openvino --imgsz 640
and to reproduce the issue I ran inference in two ways:
- .onnx:
import cv2
import numpy as np
import onnxruntime as onnxrt

image = cv2.imread('data/cropped.png')
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (640, 640))
# Reshape to network input shape (NCHW)
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)

onnx_session = onnxrt.InferenceSession("models/bucket_11_03_2022.onnx")
onnx_inputs = {onnx_session.get_inputs()[0].name: input_image.astype(np.float32)}
onnx_output = onnx_session.run(None, onnx_inputs)
img_label = onnx_output[0]
print(onnx_output[0].shape)
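For comparison (just an assumption about what might differ from detect.py, not a confirmed cause): YOLOv5's own pipeline feeds the network an RGB, float32 tensor scaled to [0, 1], while the snippet above passes raw BGR 0-255 values. A hedged sketch of that preprocessing, reusing the variables above:

# Preprocessing roughly as in YOLOv5's detect.py (without letterbox padding), shown for comparison
rgb = cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB)                # BGR -> RGB
blob = rgb.transpose(2, 0, 1)[None].astype(np.float32) / 255.0      # HWC -> NCHW, scale to [0, 1]
onnx_output = onnx_session.run(None, {onnx_session.get_inputs()[0].name: blob})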
- OpenVINO:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(
    model="bucket_11_03_2022.xml",
    weights="bucket_11_03_2022.bin",
)
exec_net = ie.load_network(net, "CPU")

output_layer_ir = next(iter(exec_net.outputs))
input_layer_ir = next(iter(exec_net.input_info))

# Load the image (OpenCV reads it in BGR order)
image = cv2.imread("data/cropped.png")
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = net.input_info[input_layer_ir].tensor_desc.dims
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (W, H))
# Reshape to network input shape (NCHW)
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB));

result = exec_net.infer(inputs={input_layer_ir: input_image})
print(result[output_layer_ir].shape)
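The IR result can then go through the same post-processing as the ONNX output; decode_yolov5 here refers to the illustrative helper sketched earlier, not an OpenVINO API:

detections = result[output_layer_ir]           # shape (1, 25200, 6), same layout as the ONNX output
boxes, scores = decode_yolov5(detections[0])   # hypothetical helper defined in the sketch above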
Could you guys help me get the correct inference results (bounding boxes with scores) from the .onnx model or the OpenVINO IR (.bin/.xml)?
The model files are here.