Do you guys know where or how to obtain the 0.47 MB version of SqueezeNet?
In other words, how can I make the weight bit-width 6 instead of 8?
I cannot find the spot to modify in this SqueezeNet generation code.
With the following method, I got a 0.77 MB model! Let's assume we have a Keras model called SqueezeNet_model. First we convert SqueezeNet to a TensorFlow Lite model:

converter = tf.lite.TFLiteConverter.from_keras_model(SqueezeNet_model)
tflite_model = converter.convert()
open("SqueezeNet_model.tflite", "wb").write(tflite_model)

Then, we can use post-training quantization to decrease the size of the model:

converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
open("SqueezeNet_Quant_model.tflite", "wb").write(tflite_quant_model)
print("Quantized model in MB:", os.path.getsize('SqueezeNet_Quant_model.tflite') / float(2**20))  # I got a 0.77 MB model
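About the 6-bit part of your question: tf.lite.Optimize.DEFAULT stores the weights as 8-bit integers, which is where the roughly 4x size reduction comes from, and as far as I know the TFLite converter does not expose a 6-bit mode (the 0.47 MB SqueezeNet numbers come from a separate deep-compression pipeline). To see what the bit-width changes, here is a conceptual NumPy sketch of symmetric uniform quantization; it is not the actual TFLite implementation, and the weight array here is random toy data:

```python
import numpy as np

def quantize(w, bits):
    # Symmetric uniform quantization: map floats onto a signed integer grid
    # with 2**(bits-1) - 1 positive levels (127 for 8-bit, 31 for 6-bit).
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale).astype(np.int8), scale

def dequantize(q, scale):
    # Recover approximate float weights from the stored integers.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)  # stand-in weight tensor

q8, s8 = quantize(w, 8)  # what 8-bit storage looks like
q6, s6 = quantize(w, 6)  # a 6-bit grid (still held in an int8 container here)

err8 = np.max(np.abs(w - dequantize(q8, s8)))
err6 = np.max(np.abs(w - dequantize(q6, s6)))
print("max abs error  8-bit:", err8, " 6-bit:", err6)
```

Fewer bits means a coarser grid, so going from 8 to 6 bits trades extra quantization error for the smaller storage footprint.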
Finally, we can test our model with:
# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="SqueezeNet_Quant_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensor details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on the test set, one sample at a time.
input_shape = input_details[0]['shape']
acc = 0
for i in range(len(x_test)):
    input_data = np.array(x_test[i].reshape(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    if np.argmax(output_data) == np.argmax(y_test[i]):
        acc += 1
acc = acc / len(x_test)
print(acc * 100)
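The loop above counts top-1 matches sample by sample (interpreter.invoke() only handles one input at a time). Once the predictions are collected into an array, the same metric is a one-liner in NumPy; here is a minimal sketch where logits and labels are made-up stand-ins for the model outputs and one-hot test labels:

```python
import numpy as np

# Toy stand-ins, shape (num_samples, num_classes).
logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([[0, 1], [1, 0], [0, 1]])

# Top-1 accuracy: fraction of samples where the predicted class
# (argmax of logits) matches the true class (argmax of one-hot label).
accuracy = np.mean(np.argmax(logits, axis=1) == np.argmax(labels, axis=1))
print(accuracy * 100)  # prints 100.0
```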