I am working with TensorFlow/Keras and want to quantize model parameters and then implement the model with NumPy. I built a 1D CNN model, trained it, and quantized its parameters to UINT8 using TensorFlow post-training quantization. I then extracted the weights and biases and exported them to .npy files.

After building the same 1D CNN in NumPy (dtype UINT8) with the extracted weights and biases, I checked the results layer by layer and got different results compared to the quantized model's outputs. When I compare my NumPy implementation against the floating-point model (without quantization to UINT8), I do get the same outputs as the Keras model, so I assume the NumPy model itself is working correctly. :)
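
For reference, this is roughly how I produced and exported the quantized parameters (a minimal sketch; model and representative_data are placeholders for my trained 1D CNN and a sample of its training inputs):

    import numpy as np
    import tensorflow as tf

    # `model` is the trained Keras 1D CNN, `representative_data` is an
    # iterable of float32 training samples -- both are placeholder names.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = lambda: (
        [np.expand_dims(x, 0).astype(np.float32)] for x in representative_data
    )
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()

    # Dump every constant tensor (weights/biases) together with its
    # quantization parameters; intermediate activation tensors hold no
    # data at this point and raise ValueError, so they are skipped.
    for detail in interpreter.get_tensor_details():
        scale, zero_point = detail['quantization']
        try:
            data = interpreter.get_tensor(detail['index'])
        except ValueError:
            continue
        np.save(detail['name'].replace('/', '_') + '.npy', data)
        print(detail['name'], data.dtype, 'scale:', scale, 'zero_point:', zero_point)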

As far as I understand, interpreter.get_input_details() includes the quantization scale and zero-point parameters of the input tensor, which are required if I want to convert the UINT8 values back to float. Am I right?
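
This is the kind of conversion I have in mind, using real_value = scale * (quantized_value - zero_point) (a minimal sketch; model_quant.tflite is a placeholder for my converted model):

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='model_quant.tflite')  # placeholder path
    interpreter.allocate_tensors()

    input_detail = interpreter.get_input_details()[0]
    scale, zero_point = input_detail['quantization']

    # Dequantize: real_value = scale * (quantized_value - zero_point)
    q = np.array([0, 128, 255], dtype=np.uint8)  # example UINT8 values
    real = scale * (q.astype(np.float32) - zero_point)

    # Quantize the other way round, clamping to the UINT8 range:
    q_back = np.clip(np.round(real / scale) + zero_point, 0, 255).astype(np.uint8)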

I would be very happy for any suggestions on how to get the same results as the quantized Keras model.

  • Can you share the exact conversion options you used, and the code you used to compare the inference results? – yyoon Oct 21 '20 at 07:21
  • I will do it soon. Do you know how TF-Lite adds layers to the unquantized model when it performs post-training quantization? – ItamarE Oct 22 '20 at 10:30

0 Answers