I wish to quantize the weights and biases of an existing neural network model. As I understand it, a fixed-point representation fixes the bit-width of the weights, biases, and activations, with a predetermined number of integer and fraction bits.
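To make it concrete, here is a small sketch of the rounding scheme I have in mind (the helper `to_fixed_point` and its Q-format conventions are just my own illustration, not from any particular library):

```python
import numpy as np

def to_fixed_point(x, int_bits, frac_bits, signed=True):
    """Round x onto a fixed-point grid with the given integer/fraction split.

    Values are scaled by 2**frac_bits, rounded to the nearest integer,
    clipped to the representable range, then scaled back to floats.
    """
    scale = 2.0 ** frac_bits
    total_bits = int_bits + frac_bits + (1 if signed else 0)
    if signed:
        lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    else:
        lo, hi = 0, 2 ** total_bits - 1
    q = np.clip(np.round(x * scale), lo, hi)
    return q / scale

w = np.array([0.7231, -1.4056, 0.0493])
# With 2 integer and 5 fraction bits: [0.71875, -1.40625, 0.0625]
print(to_fixed_point(w, int_bits=2, frac_bits=5))
```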
Essentially, I want to perform post-training quantization. I checked out this article: https://www.tensorflow.org/model_optimization/guide/quantization/post_training.
However, I couldn't find any support for what I want to do, i.e. specifying the number of integer and fraction bits of the fixed-point representation for the weights, biases, and activations.
I did find the QKeras library, which seems to support this functionality. However, it does not appear to have a built-in quantized sigmoid layer.
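For reference, this is roughly how far I got with QKeras: `quantized_bits(bits, integer)` lets me pin the total bit-width and the number of integer bits for weights and biases. The sigmoid workaround at the end (a float sigmoid followed by an output quantizer) is my own improvisation, not an official recipe:

```python
from tensorflow import keras
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

model = keras.Sequential([
    keras.layers.Input(shape=(16,)),
    # 8-bit weights/biases with 2 integer bits
    QDense(8,
           kernel_quantizer=quantized_bits(8, 2, alpha=1),
           bias_quantizer=quantized_bits(8, 2, alpha=1)),
    QActivation(quantized_relu(8, 2)),
    # No built-in quantized sigmoid as far as I can tell, so my current
    # workaround: a float sigmoid followed by an unsigned output quantizer
    # (sigmoid outputs lie in [0, 1), so no integer or sign bits needed).
    keras.layers.Dense(1, activation="sigmoid"),
    QActivation(quantized_bits(8, 0, keep_negative=False)),
])
```

I am not sure whether quantizing the sigmoid's output like this is equivalent to a properly quantized sigmoid layer, which is partly why I am asking.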
Any pointers or library/article recommendations that could help me achieve this would be greatly appreciated.