
**Hello everyone, I recently converted a TensorFlow float model to a quantized INT8 TFLite model, and the conversion finished without errors. However, when I run inference with this model in Python, I can't get good results. The code is as follows:**

Convert TF model

    def representative_dataset_gen():
        for i in range(20):
            data_x, data_y = validation_generator.next()
            for data in data_x:  # was `data_xx`, leaving `data` undefined below
                data = tf.reshape(data, shape=[-1, 128, 128, 3])
                yield [data]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    quantized_model = converter.convert()

    with open("/content/drive/My Drive/model.tflite", "wb") as f:
        f.write(quantized_model)
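For intuition about what the representative dataset does during conversion, here is a simplified pure-Python sketch (not TFLite's exact calibration algorithm) of how the observed min/max of the calibration data determine an affine scale/zero-point pair for INT8:

```python
# Simplified illustration: the representative dataset drives calibration by
# exposing the float range of each tensor; that range is mapped onto int8.

def int8_calibration(rmin, rmax):
    """Map the observed float range [rmin, rmax] onto int8 [-128, 127]."""
    # Ensure zero is exactly representable, as TFLite requires.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / 255.0
    zero_point = int(round(-128 - rmin / scale))
    return scale, zero_point

# Example: activations observed in [0, 1] (images rescaled by 1/255).
scale, zero_point = int8_calibration(0.0, 1.0)
print(scale, zero_point)  # scale = 1/255 ≈ 0.00392, zero_point = -128
```

This is why too few (or unrepresentative) calibration samples can produce a poor range and hence poor quantized accuracy.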

Run inference

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import load_img, img_to_array

    tflite_file = './model_google.tflite'
    img_name = './img_test/1_2.jpg'

    test_image = load_img(img_name, target_size=(128, 128))
    test_image = img_to_array(test_image)
    test_image = test_image.reshape(1, 128, 128, 3)
    #test_image = test_image.astype('float32')

    interpreter = tf.lite.Interpreter(model_path=tflite_file)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()[0]
    input_scale, input_zero_point = input_details['quantization']

    # Quantize the input: q = x / scale + zero_point
    test_image_int = test_image / input_scale + input_zero_point
    test_image_int = test_image_int.astype(input_details['dtype'])

    interpreter.set_tensor(input_details['index'], test_image_int)
    interpreter.invoke()

    output_details = interpreter.get_output_details()[0]
    output = interpreter.get_tensor(output_details['index'])

    # Dequantize the output: x = (q - zero_point) * scale
    scale, zero_point = output_details['quantization']
    tflite_output = output.astype(np.float32)
    tflite_output = (tflite_output - zero_point) * scale

    print(input_scale)
    print(tflite_output)
    print(input_details['quantization'])
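The quantize/dequantize arithmetic in the snippet above follows TFLite's affine mapping, real_value = scale * (int8_value - zero_point). A minimal pure-Python round trip (with example scale/zero-point values, not taken from the actual model) looks like this:

```python
def quantize(x, scale, zero_point):
    """Float -> int8 under the affine mapping, clamped to the int8 range."""
    q = int(round(x / scale)) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """int8 -> float under the same affine mapping."""
    return scale * (q - zero_point)

scale, zero_point = 1.0 / 255.0, -128  # typical for inputs in [0, 1]
q = quantize(0.5, scale, zero_point)
x = dequantize(q, scale, zero_point)
print(q, x)  # q = 0, x ≈ 0.502
```

Note the clamp: any input outside the calibrated range saturates at -128 or 127, which is one way mismatched preprocessing between training and inference destroys accuracy.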

Could you tell me how I can predict a class with this quantized model (both input and output are converted to INT8) and get the right probability values?

Abid
  • If you provide a large enough representative dataset, the quantization range can be computed correctly. If possible, please consider providing enough actual data in the representative_dataset_gen method. – Jae sung Chung Mar 23 '21 at 22:34
  • Hi Jae, I answered you in the post – Abid Mar 24 '21 at 08:40

1 Answer


Hi Jae, thank you for your answer. Here is the representative dataset code:

    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=30,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.1,
        zoom_range=[0.6, 1.1],
        horizontal_flip=True,
        brightness_range=[0.8, 1.3],
        channel_shift_range=2.0,
        fill_mode='nearest')

    train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        classes=classes,
        class_mode='categorical')

    def representative_dataset_gen():
        for i in range(10):
            data_x, data_y = train_generator.next()
            for data in data_x:
                data = tf.reshape(data, shape=[-1, 128, 128, 3])
                yield [data]

I used data from the training dataset for quantization. Could you tell me how to preprocess the image before sending it to the input, and how to read the inference result at the output? Thank you.
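On reading the output: one hypothetical sketch (assuming a classification head; the values here are made up, not from the actual model) is to take the argmax of the dequantized output vector, applying a softmax first only if the model emits raw logits rather than probabilities:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a plain Python list."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

tflite_output = [0.1, 2.3, 0.4]  # made-up dequantized output values
probs = softmax(tflite_output)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(predicted_class)  # 1
```

If the Keras model already ends in a softmax layer, the dequantized outputs are approximate probabilities (coarsened by the output scale) and the argmax alone is enough.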

Abid
  • Quantization requires some per-model experimentation by the model maker in order to make the quantized model's outputs equivalent to the original float model's. My recommendation is to provide enough data to the representative_dataset_gen method in order to represent the statistics of the input candidates. Providing ten data records may not be enough to represent the characteristics of the model. If you have difficulty getting the same quality from the quantized model, you can consider using the original float model. – Jae sung Chung Mar 24 '21 at 10:52