I am trying to quantize the weights and biases of my neural network to a 16-bit integer format. The reason for this is to use these arrays in CCS to program the network on an MCU. I followed the post-training quantization workflow in TensorFlow Lite and got results for a conversion to the uint8 format, but I am not sure how to achieve the same for a 16-bit format. My code for the uint8 conversion was as follows:
import numpy as np
import tensorflow as tf

def representative_data_gen():
    data = np.array(x_train, dtype=np.float32)
    for input_value in data:
        yield [input_value]
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Set the optimization mode
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Pass representative dataset to the converter
converter.representative_dataset = representative_data_gen
# Restricting supported target op specification to INT8
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
# Convert and Save the model
tflite_model = converter.convert()
open("clap_model.tflite", "wb").write(tflite_model)
My x_train array contains values in the float32 format. As I read through the approaches on the TensorFlow Lite page, I did see a 16x8 mode (16-bit activations), but the weights still remain in an 8-bit format in that scenario.
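For reference, my understanding is that the 16x8 mode would look something like the sketch below, reusing the same model and representative_data_gen as above; the only real change from my uint8 code is the target op set (and it still leaves the weights at 8 bits, which is exactly my problem):

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# 16-bit activations with 8-bit weights (experimental 16x8 quantization mode)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
tflite_16x8_model = converter.convert()
open("clap_model_16x8.tflite", "wb").write(tflite_16x8_model)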
If there is any other way to convert these floating-point values, or even the obtained 8-bit integers, to a 16-bit integer format, that would also be extremely helpful. The only approach I can think of is manually quantizing the floating-point values to 16-bit integers, but my guess is that would be somewhat tedious, since I would have to extract the weights and biases and then pass each of them through such a quantization function.
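Roughly what I have in mind for the manual route is something like the sketch below. It assumes a simple symmetric per-tensor scheme, and quantize_int16 is just a hypothetical helper name I made up, not anything from TensorFlow:

def quantize_int16(w):
    # Symmetric per-tensor quantization: map the largest magnitude to 32767
    # (assumes the tensor is not all zeros)
    scale = np.max(np.abs(w)) / 32767.0
    q = np.round(w / scale).astype(np.int16)
    return q, scale

# Quantize every weight/bias tensor of the trained Keras model;
# the scale per tensor would also need to be stored for use on the MCU
int16_tensors = [quantize_int16(w) for w in model.get_weights()]

Is this the sensible way to do it, or is there a built-in path I am missing?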