I am working with TensorFlow/Keras and want to quantize model parameters and then implement the model with NumPy. I built a 1D CNN model, trained it, and quantized its parameters to UINT8 using TensorFlow post-training quantization. I then extracted the weights and biases and exported them to .npy files.

After building the same 1D CNN in NumPy (dtype UINT8) with the extracted weights and biases, I checked the results layer by layer and got different results compared to the quantized model's outputs. When I compare my NumPy implementation against the floating-point model (without quantization to UINT8), I do get the same outputs as the Keras model, so I assume the NumPy model itself is working correctly. :)
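
For reference, this is roughly how I produced and exported the quantized parameters (a minimal sketch; model and representative_data are placeholders for my trained 1D CNN and a sample of its training inputs):

    import numpy as np
    import tensorflow as tf

    # `model` is the trained Keras 1D CNN, `representative_data` is an
    # iterable of float32 training samples -- both are placeholder names.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = lambda: (
        [np.expand_dims(x, 0).astype(np.float32)] for x in representative_data
    )
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()

    # Dump every constant tensor (weights/biases) together with its
    # quantization parameters; intermediate activation tensors hold no
    # data at this point and raise ValueError, so they are skipped.
    for detail in interpreter.get_tensor_details():
        scale, zero_point = detail['quantization']
        try:
            data = interpreter.get_tensor(detail['index'])
        except ValueError:
            continue
        np.save(detail['name'].replace('/', '_') + '.npy', data)
        print(detail['name'], data.dtype, 'scale:', scale, 'zero_point:', zero_point)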

As far as I understand, interpreter.get_input_details() includes the quantization scale and zero-point parameters of the input tensor, which are required if I want to convert the UINT8 values back to float. Am I right?
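
This is the kind of conversion I have in mind, using real_value = scale * (quantized_value - zero_point) (a minimal sketch; model_quant.tflite is a placeholder for my converted model):

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='model_quant.tflite')  # placeholder path
    interpreter.allocate_tensors()

    input_detail = interpreter.get_input_details()[0]
    scale, zero_point = input_detail['quantization']

    # Dequantize: real_value = scale * (quantized_value - zero_point)
    q = np.array([0, 128, 255], dtype=np.uint8)  # example UINT8 values
    real = scale * (q.astype(np.float32) - zero_point)

    # Quantize the other way round, clamping to the UINT8 range:
    q_back = np.clip(np.round(real / scale) + zero_point, 0, 255).astype(np.uint8)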

I would be very happy for any suggestions on how to get the same results as the quantized Keras model.

  • Can you share the exact conversion options you used, and the code you used to compare the inference results? – yyoon Oct 21 '20 at 07:21
  • I will do it soon. Do you know how TF-Lite adds layers to the unquantized model when it performs post-training quantization? – ItamarE Oct 22 '20 at 10:30

0 Answers