The conversion of TFGPT2LMHeadModel to TFLite produces unexpected input and output shapes compared to the pre-trained gpt2-64.tflite model. How can we fix this?
!wget https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64.tflite
import numpy as np
import tensorflow as tf
tflite_model_path = 'gpt2-64.tflite'
# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path=tflite_model_path)
interpreter.allocate_tensors()
# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
# Run inference on random token IDs and print the shapes
# (GPT-2's vocabulary size is 50257, so sample valid IDs from that range)
input_data = np.random.randint(0, 50257, size=input_shape, dtype=np.int32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.shape)
print(input_shape)
This gives the output:
>(1, 64, 50257)
>[ 1 64]
which is as expected.
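For context: the [ 1 64] input is a sequence of 64 token IDs, and the (1, 64, 50257) output is the per-position logits over GPT-2's 50257-token vocabulary. A minimal sketch of greedy decoding from that output, assuming the transformers GPT2Tokenizer is available:
from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# Greedy decoding: pick the highest-scoring vocabulary entry at each position
predicted_ids = np.argmax(output_data, axis=-1)  # shape (1, 64)
print(tokenizer.decode(predicted_ids[0].tolist()))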
However, when we convert TFGPT2LMHeadModel to TFLite ourselves, we get a different result, as shown below:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel
import numpy as np
model = TFGPT2LMHeadModel.from_pretrained('gpt2') # or 'distilgpt2'
input_spec = tf.TensorSpec([1, 64], tf.int32)
model._set_inputs(input_spec, training=False)
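# NOTE: _set_inputs is a private Keras API; it may not actually pin the [1, 64] shape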
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# For FP16 quantization:
# converter.optimizations = [tf.lite.Optimize.DEFAULT]
# converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
open("gpt2-64-2.tflite", "wb").write(tflite_model)
tflite_model_path = 'gpt2-64-2.tflite'
# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path=tflite_model_path)
interpreter.allocate_tensors()
# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
# Run inference on random token IDs and print the shapes
input_data = np.random.randint(0, 50257, size=input_shape, dtype=np.int32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.shape)
print(input_shape)
Output:
>(2, 1, 12, 1, 64)
>[1 1]
How can we fix this?
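For what it's worth, the [1 1] input shape suggests the [1, 64] TensorSpec never made it into the converted graph, and (2, 1, 12, 1, 64) looks like one of GPT-2's past key/value cache tensors (2 for key and value, 12 attention heads, head size 64) rather than the logits. One workaround we are considering (an untested sketch; the serving wrapper name is our own) is to trace the model through a tf.function with an explicit input signature that returns only the logits, and convert from the concrete function:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

model = TFGPT2LMHeadModel.from_pretrained('gpt2')

# Trace a fixed-shape call that returns only the logits,
# so the past key/value cache is not exported as an output
@tf.function(input_signature=[tf.TensorSpec([1, 64], tf.int32, name='input_ids')])
def serving(input_ids):
    outputs = model(input_ids)
    return outputs[0]  # logits, shape (1, 64, 50257)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [serving.get_concrete_function()])
tflite_model = converter.convert()
open('gpt2-64-fixed.tflite', 'wb').write(tflite_model)
Is this the right approach, or is there a supported way to get the expected [1, 64] input and (1, 64, 50257) output?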