
I trained a YOLOv7 model on a custom dataset and converted it to ONNX. In Netron, the model input reads "Float32(1,3,640,640)", which I understand. The output, however, is unclear to me: other tutorials say there should be 6 elements per detection describing the bounding box (x, y, w, h, objectness, and class number), but this model outputs 7 elements, with an extra 0 at the start, as follows:

Float32(concatoutput_dim_0,7)

The output (sample data):

0, 24.744838, 50, 24.744838, 70.46938, 1, 1, 0, 40.495939, 30.95939, 40.495939, 123.2848439, 1, 1, ... (Each group of 7 values starts with zero, and the second and fourth values are almost always equal.) Does the 0 mean dimension 0?

I successfully converted YOLOv7 to ONNX and pre-processed the input image. What is unclear is why the output has a 7th element.

Loay Altal
  • Hi, I am facing similar issues like you, trying to understand the output of the model. Have you got correct results? I am facing random bounding boxes when using ONNX runtime and the ONNX exported model, whereas I get correct results using the Python `detect.py` file and the weights in `.pt`. – juan carlos Jan 26 '23 at 11:03
  • @juancarlos The ONNX output of YOLOv7 is [0, max X, max Y, min X, min Y, conf, class], where 0 is the image number in the batch (with a single image, every box has this zero, which you can simply remove). Also, make sure to preprocess the image correctly for the ONNX model to work properly. – Loay Altal Jan 29 '23 at 20:04
  • From what I understand of the output, (min X, min Y) would be the upper-left corner of the bounding box, and (max X - min X, max Y - min Y) the width and height, right? – juan carlos Jan 30 '23 at 10:26
  • Just solved my issue, but in my case, my output is not as you stated, my output is [0, min X, min Y, max X, max Y, conf, class] – juan carlos Jan 30 '23 at 11:11
  • Yes, apologies for switching values - and yes, these are coordinates of a bbox as you mentioned. Happy coding! – Loay Altal Jan 30 '23 at 13:19
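
The layout settled on in the comments above, [image index, min X, min Y, max X, max Y, conf, class], can be unpacked with a few lines of NumPy. This is only a minimal sketch under that assumption: `parse_detections`, the `conf_threshold` parameter, and the sample rows are hypothetical names and values made up for illustration, and the order of the last two columns is taken from this thread, so it is worth checking against your own export before relying on it.

```python
import numpy as np

def parse_detections(output, conf_threshold=0.25):
    """Turn an (N, 7) output array into a list of detection dicts.

    Assumed column order (from the comments above, not verified here):
    [image index, min X, min Y, max X, max Y, conf, class]
    """
    output = np.asarray(output, dtype=np.float32).reshape(-1, 7)
    detections = []
    for batch_id, x_min, y_min, x_max, y_max, conf, cls in output:
        if conf < conf_threshold:
            continue  # drop low-confidence boxes
        detections.append({
            "image": int(batch_id),  # index of the image within the batch
            # convert corner coordinates to (x, y, width, height)
            "xywh": (float(x_min), float(y_min),
                     float(x_max - x_min), float(y_max - y_min)),
            "conf": float(conf),
            "class": int(cls),
        })
    return detections

# Hypothetical rows for illustration only (not real model output).
rows = np.array([
    [0, 10.0, 20.0, 110.0, 220.0, 0.9, 1],
    [0, 5.0, 5.0, 15.0, 25.0, 0.1, 0],
])
dets = parse_detections(rows)
print(dets)
```

With a single image in the batch, the first column is always 0 and can be ignored. The box coordinates are presumably expressed in the 640x640 model input space, so mapping them back onto the original image is a separate step that depends on how the image was resized or padded during preprocessing.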
