I am trying to convert tf.keras.layers.CuDNNLSTM to tf.keras.layers.LSTM (and tf.keras.layers.CuDNNGRU to tf.keras.layers.GRU).
I see from this StackOverflow post and this GitHub PR that the CuDNNLSTM --> LSTM conversion is possible (and this GitHub PR suggests the same), but I'm seeing some unexpected behavior during inference, so I'm not sure whether I'm doing it correctly.
The way I am doing this now is by creating (and training) a model that has a CuDNNLSTM layer...
import tensorflow as tf
from tensorflow.keras import layers

# recurrent layers expect 3-D input: (batch, timesteps, features)
inputs = tf.keras.Input(shape=(None, 120))
x = layers.CuDNNLSTM(100)(inputs)  # or x = layers.CuDNNGRU(100)(inputs)
outputs = layers.Dense(5, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
... compiling and training the model ...
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(...)
... saving the model as both a JSON file for the structure and an HDF5 file for the weights, and editing the JSON so the layer is declared as an LSTM (or GRU) layer instead of a CuDNNLSTM (or CuDNNGRU) layer.
import json

model.save_weights('/path/to/weights.h5')

# get the model config as json, rename the CuDNN layer classes, dump into a file
json_config = model.to_json()
with open('/path/to/structure.json', 'w') as json_file:
    json_config_ = json.loads(json_config)
    # renamed from `layers` to avoid shadowing the Keras layers module used above
    layer_configs = json_config_['config']['layers']
    for layer in layer_configs:
        if layer['class_name'].lower() == 'cudnnlstm':
            layer['class_name'] = 'LSTM'
        elif layer['class_name'].lower() == 'cudnngru':
            layer['class_name'] = 'GRU'
    json.dump(json_config_, json_file)
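
To sanity-check the edited files before the Core ML step, I load them back into a plain LSTM/GRU model on the desktop and compare its predictions against the original CuDNN model. This is just a minimal sketch: the batch below is random placeholder data, and model is the CuDNN model from above.

import numpy as np
import tensorflow as tf

# rebuild the model from the edited json, which now declares LSTM/GRU layers
with open('/path/to/structure.json') as f:
    rebuilt = tf.keras.models.model_from_json(f.read())
rebuilt.load_weights('/path/to/weights.h5')

# compare predictions on the same (placeholder) batch of shape (batch, timesteps, features)
some_batch = np.random.rand(8, 40, 120).astype(np.float32)
print(np.max(np.abs(model.predict(some_batch) - rebuilt.predict(some_batch))))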
I am taking '/path/to/weights.h5' and '/path/to/structure.json' and converting them to Core ML to do inference on an iPhone, so it might also be failing during the Core ML conversion. Before investigating the Core ML converter, though, I want to check whether I am doing the conversion from the CuDNN layers to the LSTM/GRU layers correctly. Is this the correct way to convert from CuDNNLSTM to LSTM and from CuDNNGRU to GRU? Is there another way to do the conversion that I should be doing instead?
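
For reference, the Core ML conversion call I'm using looks roughly like this. It's a sketch assuming the old Keras-specific converter in coremltools (which I believe accepts a (json, weights) tuple); the input/output names are placeholders.

import coremltools

# hand the edited structure + weights to the (old) Keras converter;
# the input/output names here are placeholders
mlmodel = coremltools.converters.keras.convert(
    ('/path/to/structure.json', '/path/to/weights.h5'),
    input_names=['sequence_input'],
    output_names=['class_probs'])
mlmodel.save('/path/to/model.mlmodel')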