The conversion of TFGPT2LMHeadModel to TFLite produces unexpected input and output shapes compared to the pre-trained gpt2-64.tflite model. How can we fix this?
!wget https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64.tflite
import numpy as np
import tensorflow as tf
tflite_model_path = 'gpt2-64.tflite'
# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path=tflite_model_path)
interpreter.allocate_tensors()
# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
# Run inference on random token IDs and print the shapes
# (GPT-2's vocabulary size is 50257, so sample valid IDs from that range)
input_data = np.random.randint(0, 50257, size=input_shape, dtype=np.int32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.shape)
print(input_shape)
This gives the output:
>(1, 64, 50257)
>[ 1 64]
which is as expected.
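For context: the [ 1 64] input is a sequence of 64 token IDs, and the (1, 64, 50257) output is the per-position logits over GPT-2's 50257-token vocabulary. A minimal sketch of greedy decoding from that output, assuming the transformers GPT2Tokenizer is available:
from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# Greedy decoding: pick the highest-scoring vocabulary entry at each position
predicted_ids = np.argmax(output_data, axis=-1)  # shape (1, 64)
print(tokenizer.decode(predicted_ids[0].tolist()))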
However, when we convert TFGPT2LMHeadModel to TFLite ourselves, we get a different result, as shown below:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel
import numpy as np
model = TFGPT2LMHeadModel.from_pretrained('gpt2') # or 'distilgpt2'
input_spec = tf.TensorSpec([1, 64], tf.int32)
model._set_inputs(input_spec, training=False)
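# NOTE: _set_inputs is a private Keras API; it may not actually pin the [1, 64] shape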
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# For FP16 quantization:
# converter.optimizations = [tf.lite.Optimize.DEFAULT]
# converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
open("gpt2-64-2.tflite", "wb").write(tflite_model)
tflite_model_path = 'gpt2-64-2.tflite'
# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path=tflite_model_path)
interpreter.allocate_tensors()
# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
# Run inference on random token IDs and print the shapes
input_data = np.random.randint(0, 50257, size=input_shape, dtype=np.int32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.shape)
print(input_shape)
Output:
>(2, 1, 12, 1, 64)
>[1 1]
How can we fix this?
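For what it's worth, the [1 1] input shape suggests the [1, 64] TensorSpec never made it into the converted graph, and (2, 1, 12, 1, 64) looks like one of GPT-2's past key/value cache tensors (2 for key and value, 12 attention heads, head size 64) rather than the logits. One workaround we are considering (an untested sketch; the serving wrapper name is our own) is to trace the model through a tf.function with an explicit input signature that returns only the logits, and convert from the concrete function:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

model = TFGPT2LMHeadModel.from_pretrained('gpt2')

# Trace a fixed-shape call that returns only the logits,
# so the past key/value cache is not exported as an output
@tf.function(input_signature=[tf.TensorSpec([1, 64], tf.int32, name='input_ids')])
def serving(input_ids):
    outputs = model(input_ids)
    return outputs[0]  # logits, shape (1, 64, 50257)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [serving.get_concrete_function()])
tflite_model = converter.convert()
open('gpt2-64-fixed.tflite', 'wb').write(tflite_model)
Is this the right approach, or is there a supported way to get the expected [1, 64] input and (1, 64, 50257) output?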