What I am trying to do is convert a quantized TFLite model to ONNX with tf2onnx and run it directly with ONNX Runtime (ORT). The network is YOLOv5, pre-trained on my own dataset plus COCO. The conversion itself succeeded; I am sure of that because I loaded both the ONNX model (via ORT) and the TFLite model, saved the quantization parameters from the input and output details of the TFLite interpreter, ran the ONNX session using those quantization parameters, and it produced results identical to running the TFLite model / the original .pt model I trained with the Ultralytics API.
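For reference, the pre/post-processing I wrap around the ONNX session is just the standard affine (de)quantization. A minimal sketch, assuming uint8 tensors and made-up scale/zero-point values (the real ones come from the TFLite interpreter's details):

```python
import numpy as np

# Example quantization parameters, as reported by the TFLite interpreter
# (e.g. input_details[0]['quantization']); values here are illustrative only.
in_scale, in_zero_point = 0.0078125, 128

def quantize(x, scale, zero_point):
    # float32 -> uint8, as expected by the quantized model's input
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # uint8 -> float32, applied to the raw quantized model output
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, 0.0, 0.5], dtype=np.float32)
q = quantize(x, in_scale, in_zero_point)
# round-trip recovers the input up to one quantization step
x_rec = dequantize(q, in_scale, in_zero_point)
```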
The problem is that with TFLite you can read the quantization parameters (scale factor and zero point) straight out of the interpreter's input or output details, but I have no idea how to find those exact numbers in the converted ONNX model.
With ORT, get_inputs() and get_outputs() only give you the name, type, and shape of each input/output tensor. As for the ONNX Python API, I haven't dug into it that deeply.
Any insight regarding the issue would be appreciated. Thank you in advance and have a nice day.