I am currently interested in model quantization, especially post-training quantization of neural network models.
I simply want to convert an existing model (i.e., a TensorFlow model) that uses float32 weights into a quantized model with float16 weights.
Simply changing the data type of every weight (i.e., float32 -> float16) would not work, would it?
Then, how can I manually perform this type of quantization?
Suppose I have a weight matrix that looks like this: [ 0.02894312 -1.8398855 -0.28658497 -1.1626594 -2.3152962 ]
Simply casting every weight in the matrix to the float16 dtype would produce the following matrix: [ 0.02895 -1.84 -0.2866 -1.163 -2.314 ]
which, I suspect, is not how the quantization in off-the-shelf libraries (e.g., TF Lite) actually works...
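For concreteness, the naive cast I mean is just this (a minimal numpy sketch; the values are the ones from the matrix above):

```python
import numpy as np

# Original float32 weights (the matrix from above)
w32 = np.array([0.02894312, -1.8398855, -0.28658497,
                -1.1626594, -2.3152962], dtype=np.float32)

# Naive "quantization": a plain dtype cast to float16
w16 = w32.astype(np.float16)
print(w16)  # -> [ 0.02895 -1.84 -0.2866 -1.163 -2.314 ]
```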
Then, how can I manually perform this type of quantization properly?
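For reference, the off-the-shelf route I'm comparing against looks like this (float16 post-training quantization via the TF Lite converter, based on the TF Lite docs; "path/to/saved_model" is a placeholder for my model's directory):

```python
import tensorflow as tf

# Float16 post-training quantization with the TF Lite converter.
# "path/to/saved_model" is a placeholder for the actual model directory.
converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16_model = converter.convert()

with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_fp16_model)
```

What I'd like to understand is what this converter actually does to the weights internally, so that I can reproduce it by hand.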