I'm working with an ONNX model that I need to quantize to reduce its size. To do that, I'm following the instructions in the official documentation:
import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType

# paths to the original FP32 model and the quantized output
model_fp32 = 'path/to/the/model.onnx'
model_quant = 'path/to/the/model.quant.onnx'

# dynamic quantization: weights are converted to unsigned 8-bit integers
quantized_model = quantize_dynamic(model_fp32, model_quant, weight_type=QuantType.QUInt8)
But when I run it, I get the following warning:
WARNING:root:The original model opset version is 9, which does not support quantization. Please update the model to opset >= 11. Updating the model automatically to opset 11. Please verify the quantized model.
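For what it's worth, the same opset upgrade can be attempted manually with ONNX's version converter (a minimal sketch with placeholder paths; I don't know whether it handles the deprecated Upsample op any better):

import onnx
from onnx import version_converter

# load the original opset-9 model and ask the converter for opset 11
model = onnx.load('path/to/the/model.onnx')
converted = version_converter.convert_version(model, 11)
onnx.save(converted, 'path/to/the/model_opset11.onnx')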
I tested the quantized model, and it did not work; it fails with this error:
INVALID_GRAPH : Load model from model_a2_quant.onnx failed:This is an invalid model. Error in Node:Upsample__477 : Op registered for Upsample is deprecated in domain_version of 11
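(The test is nothing special; just loading the file into an ONNX Runtime session is enough to trigger the error. A minimal repro, assuming the CPU execution provider:)

import onnxruntime as ort

# loading the quantized model raises INVALID_GRAPH
sess = ort.InferenceSession('model_a2_quant.onnx', providers=['CPUExecutionProvider'])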
What alternatives do I have at this point to quantize the model?
I got the original model, in TensorFlow, from this repo: https://github.com/ciber-lab/pictor-ppe
And converted it to ONNX with this code:
import tensorflow as tf
from tensorflow.keras.layers import Input
from yolo3.model import yolo_body  # YOLOv3 builder from the repo (adjust the module path if needed)

# input and output (input_shape, num_anchors and num_classes come from the repo's config)
input_tensor = Input( shape=(input_shape[0], input_shape[1], 3) ) # input
num_out_filters = ( num_anchors//3 ) * ( 5 + num_classes ) # output

## Build and load the model
model = yolo_body(input_tensor, num_out_filters)
weight_path = 'ONNX_demo/models/pictor-ppe-v302-a1-yolo-v3-weights.h5'
model.load_weights( weight_path )

# export as a TensorFlow SavedModel
tf.saved_model.save(model, "ONNX_demo/models/save_model")

Then I convert it to ONNX format from the command line:

python3 -m tf2onnx.convert --saved-model "ONNX_demo/models/save_model" --output "ONNX_demo/models/model.onnx"
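One alternative I'm considering is re-running the conversion with an explicit target opset, since I believe tf2onnx can then emit the newer Resize op in place of the deprecated Upsample (untested on my side, and the exact opset value is a guess):

python3 -m tf2onnx.convert --saved-model "ONNX_demo/models/save_model" --output "ONNX_demo/models/model.onnx" --opset 13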