What I am trying to do is convert a quantized TFLite model to ONNX with tf2onnx and run it directly with ONNX Runtime (ORT). The network is YOLOv5, pre-trained on my own dataset plus COCO. The conversion itself succeeded; I am sure of that because I loaded both the ONNX model (via ORT) and the TFLite model, saved the quantization parameters from the input and output details of the TFLite interpreter, ran the ONNX session using those quantization parameters, and it produced results identical to running the TFLite model / the original .pt model I trained with the Ultralytics API.
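For reference, the pre/post-processing I wrap around the ONNX session is just the standard affine (de)quantization. A minimal sketch, assuming uint8 tensors and made-up scale/zero-point values (the real ones come from the TFLite interpreter's details):

```python
import numpy as np

# Example quantization parameters, as reported by the TFLite interpreter
# (e.g. input_details[0]['quantization']); values here are illustrative only.
in_scale, in_zero_point = 0.0078125, 128

def quantize(x, scale, zero_point):
    # float32 -> uint8, as expected by the quantized model's input
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # uint8 -> float32, applied to the raw quantized model output
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, 0.0, 0.5], dtype=np.float32)
q = quantize(x, in_scale, in_zero_point)
# round-trip recovers the input up to one quantization step
x_rec = dequantize(q, in_scale, in_zero_point)
```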
The problem is that with TFLite you can read the quantization parameters (scale factor and zero point) straight out of the interpreter's input or output details, but I have no idea how to find those exact numbers in the converted ONNX model.
With ORT, get_inputs() and get_outputs() only give you the name, type, and shape of each input/output tensor. As for the ONNX Python API, I haven't dug into it that deeply.
Any insight regarding the issue would be appreciated. Thank you in advance and have a nice day.