Post-training int8 quantization and pruning of a model trained with ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8

Question

I'm trying to run inference with my model on an Arduino 33 BLE. To do so, I trained my model using ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8 and got a 6.5 MB model with 88% mAP@0.5 IoU, which is nice. I then tried to quantize the model to int8, but the model size increased to 11.5 MB and the accuracy dropped badly. I don't know what happened; if someone can help, that would be great.

My code to quantize the model:

import glob

import numpy as np
import tensorflow as tf
from PIL import Image

def representative_dataset_gen():
    folder = "/content/dataset/train/images"
    image_size = 320

    files = glob.glob(folder + '/*.jpeg')
    for file in files:
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        # Normalize pixel values to [-1, 1] as float32 (the float model's input type)
        image = (2.0 / 255.0) * np.asarray(image, dtype=np.float32) - 1.0
        # Add the batch dimension: (1, 320, 320, 3)
        image = image[np.newaxis, :, :, :]
        yield [image]

converter = tf.lite.TFLiteConverter.from_saved_model('/content/gdrive/MyDrive/customTF2/data/tflite/saved_model')
converter.representative_dataset = representative_dataset_gen
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_model = converter.convert()

with open('/mydrive/customTF2/data/tflite/saved_model/detect8.tflite', 'wb') as f:
  f.write(tflite_model)
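
The normalization step used in representative_dataset_gen can be checked in isolation, without TensorFlow. The following is only a minimal sketch: `normalize_image` is a hypothetical helper, a random array stands in for a real image, and it assumes (as the code above does) that the model expects float32 input in [-1, 1].

```python
import numpy as np

def normalize_image(image_uint8):
    """Scale an HxWx3 uint8 image to float32 in [-1, 1] and add a batch axis."""
    image = image_uint8.astype(np.float32) * (2.0 / 255.0) - 1.0
    return image[np.newaxis, ...]

# Random placeholder image (assumption: stands in for a resized 320x320 RGB photo).
rng = np.random.default_rng(0)
dummy = rng.integers(0, 256, size=(320, 320, 3), dtype=np.uint8)

sample = normalize_image(dummy)
print(sample.dtype, sample.shape, float(sample.min()), float(sample.max()))
```

If the dtype printed here were float64 instead of float32, the calibration samples would not match the input type of the float model.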

Also, if there is a way to prune the model, that would help reduce the size to less than 1 MB. I also tried YOLOv5 and pruned and quantized that model down to 1.9 MB, but couldn't go any further. I then tried to convert the TFLite model to a .h file to run inference on an ESP32 instead (since my TFLite model is larger than 1 MB), but the model size also increased to 11 MB.
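
One possible reason for the larger .h file, sketched here under the assumption that the header was produced by an xxd-style hex dump: each model byte is written as text such as `0x1f, `, so the header file takes roughly six characters per byte even though the compiled array is the same size as the binary. `bytes_to_c_array` and `fake_model` are hypothetical names used only for illustration; the placeholder bytes are not a real model.

```python
def bytes_to_c_array(data: bytes, name: str = "model_data") -> str:
    """Render raw bytes as a C array definition, 12 bytes per line."""
    lines = [f"const unsigned char {name}[] = {{"]
    for i in range(0, len(data), 12):
        chunk = data[i:i + 12]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"const unsigned int {name}_len = {len(data)};")
    return "\n".join(lines) + "\n"

# 1024 placeholder bytes standing in for the contents of a .tflite file.
fake_model = bytes(range(256)) * 4
header_text = bytes_to_c_array(fake_model)

# Ratio of header text size to binary size.
ratio = len(header_text) / len(fake_model)
print(f"binary: {len(fake_model)} bytes, header text: {len(header_text)} bytes, ratio: {ratio:.2f}")
```

Under that assumption, the size of the .h file on disk says little about how much flash the model needs; the binary .tflite size is the figure to compare against the 1 MB limit.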

I tried post-training quantization for my model, but the model size increased instead of decreasing, and the model's performance dropped drastically. As for pruning, I couldn't get it to work with MobileNetV2. I hope someone can help.
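
To make clear what I mean by pruning, here is a minimal numpy-only sketch of magnitude pruning on a single placeholder weight matrix. It is not the TensorFlow Model Optimization API and not my actual model; `magnitude_prune` is a hypothetical helper and the 80% sparsity is an arbitrary choice. It also illustrates that a dense tensor with zeroed weights occupies the same number of raw bytes, so any size gain shows up only after compression or sparse storage.

```python
import zlib

import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out roughly the given fraction of weights with the smallest magnitude."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights).astype(weights.dtype)

# Random placeholder weights (assumption: stands in for one layer of a real model).
rng = np.random.default_rng(0)
dense = rng.standard_normal((256, 256)).astype(np.float32)
pruned = magnitude_prune(dense, sparsity=0.8)

achieved_sparsity = float(np.mean(pruned == 0.0))
dense_compressed = len(zlib.compress(dense.tobytes()))
pruned_compressed = len(zlib.compress(pruned.tobytes()))

print(f"sparsity: {achieved_sparsity:.3f}")
print(f"raw bytes: dense={dense.nbytes}, pruned={pruned.nbytes}")
print(f"zlib bytes: dense={dense_compressed}, pruned={pruned_compressed}")
```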
