I have a model trained with YOLOv5s and it works fine.
I can get the expected result when I run inference with PyTorch:
This is an output image:
The thing is, I need it in OpenVINO, and regardless of whether I run inference with the .onnx model or with the .bin/.xml pair (OpenVINO IR), I don't get the expected result.
What I get is a vector with shape (1, 25200, 6) (see the decoding sketch after this list). I know that:
- 25200 is equal to 1x3x80x80 + 1x3x40x40 + 1x3x20x20;
- 6 = 4 box coordinates (x, y, w, h) + 1 objectness score + 1 class score;
- batch_size = 1
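For reference, here is a minimal sketch of how that raw (1, 25200, 6) tensor is normally decoded. I'm assuming the exported Detect head already applies the sigmoid/grid decoding (the YOLOv5 export default), so each row is (cx, cy, w, h, objectness, class_score) in pixels of the 640x640 input; decode_yolov5 is just an illustrative helper name, not something produced by export.py:

import cv2
import numpy as np

def decode_yolov5(pred, conf_thres=0.25, iou_thres=0.45):
    # pred: (25200, 6) array of (cx, cy, w, h, objectness, class_score) rows
    scores = pred[:, 4] * pred[:, 5]                  # final confidence = objectness * class score
    keep = scores > conf_thres
    pred, scores = pred[keep], scores[keep]
    # convert centre-based boxes to top-left (x, y, w, h), the layout cv2.dnn.NMSBoxes expects
    boxes = np.stack([pred[:, 0] - pred[:, 2] / 2,
                      pred[:, 1] - pred[:, 3] / 2,
                      pred[:, 2],
                      pred[:, 3]], axis=1)
    idx = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), conf_thres, iou_thres)
    idx = np.array(idx, dtype=int).reshape(-1)
    return boxes[idx], scores[idx]

The boxes come out in the resized 640x640 input space, so they still have to be scaled back to the original image (and corrected for letterbox padding if that is applied).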
To export it, I used:
!python export.py --data models/custom_yolov5s.yaml --weights /content/bucket_11_03_2022.pt --batch-size 1 --device cpu --include openvino --imgsz 640
and to reproduce the issue I ran inference in two ways:
- .onnx:
import cv2
import numpy as np
import onnxruntime as onnxrt

image = cv2.imread('data/cropped.png')
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (640, 640))
# Reshape to network input shape (NCHW)
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)

onnx_session = onnxrt.InferenceSession("models/bucket_11_03_2022.onnx")
onnx_inputs = {onnx_session.get_inputs()[0].name: input_image.astype(np.float32)}
onnx_output = onnx_session.run(None, onnx_inputs)
img_label = onnx_output[0]
print(onnx_output[0].shape)
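For comparison (just an assumption about what might differ from detect.py, not a confirmed cause): YOLOv5's own pipeline feeds the network an RGB, float32 tensor scaled to [0, 1], while the snippet above passes raw BGR 0-255 values. A hedged sketch of that preprocessing, reusing the variables above:

# Preprocessing roughly as in YOLOv5's detect.py (without letterbox padding), shown for comparison
rgb = cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB)                # BGR -> RGB
blob = rgb.transpose(2, 0, 1)[None].astype(np.float32) / 255.0      # HWC -> NCHW, scale to [0, 1]
onnx_output = onnx_session.run(None, {onnx_session.get_inputs()[0].name: blob})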
- OpenVINO:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(
    model="bucket_11_03_2022.xml",
    weights="bucket_11_03_2022.bin",
)
exec_net = ie.load_network(net, "CPU")

output_layer_ir = next(iter(exec_net.outputs))
input_layer_ir = next(iter(exec_net.input_info))

# Load the image (OpenCV reads it in BGR order)
image = cv2.imread("data/cropped.png")
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = net.input_info[input_layer_ir].tensor_desc.dims
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (W, H))
# Reshape to network input shape (NCHW)
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB));

result = exec_net.infer(inputs={input_layer_ir: input_image})
print(result[output_layer_ir].shape)
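The IR result can then go through the same post-processing as the ONNX output; decode_yolov5 here refers to the illustrative helper sketched earlier, not an OpenVINO API:

detections = result[output_layer_ir]           # shape (1, 25200, 6), same layout as the ONNX output
boxes, scores = decode_yolov5(detections[0])   # hypothetical helper defined in the sketch above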
Could you guys help me get the correct inference results (bounding boxes with scores) from the .onnx model or the OpenVINO IR (.bin/.xml)?
The model files are here.