
I'm building an LSTM model with attention in Keras for multi-label classification, but there are thousands of possible output labels, each with its own sigmoid prediction layer and unique attention MLP layer. Is it possible to train and save such a large model? I'm getting the following h5py RuntimeError: Unable to create attribute (Object header message is too large).
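For illustration, here is a minimal sketch of the kind of architecture I mean (the layer sizes and the exact attention formulation are simplified placeholders, not my real code):

from keras.layers import Input, LSTM, Dense, Flatten, Activation, Lambda
from keras.models import Model
import keras.backend as K

n_labels = 5  # thousands in the real model
seq_len, n_features, units = 50, 128, 64

inp = Input(shape=(seq_len, n_features))
h = LSTM(units, return_sequences=True)(inp)  # (batch, seq_len, units)

outputs = []
for i in range(n_labels):
    # per-label attention MLP: score each timestep, softmax over time
    alpha = Activation('softmax')(Flatten()(Dense(1)(h)))  # (batch, seq_len)
    # attention-weighted sum of the LSTM states -> per-label context vector
    context = Lambda(lambda t: K.sum(t[0] * K.expand_dims(t[1]), axis=1))([h, alpha])
    # per-label sigmoid prediction layer
    outputs.append(Dense(1, activation='sigmoid')(context))

model = Model(inputs=inp, outputs=outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')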

MBK
  • You need to generate your data in batches and train the model on those batches. For saving, keep only the best weights, not the entire model – enterML May 12 '17 at 14:50
  • What did you try? – Martin Thoma May 13 '17 at 05:53
  • I tried training in batches using train_on_batch() and saving the model with model.to_json() and model.save_weights(), but it wasn't able to save the weights for the full model. What do you mean by saving only the best weights? – MBK May 13 '17 at 15:09
  • Your question is about large *models*, correct? I.e., models that have hundreds of thousands of connections? Batch training only helps with large datasets, not large models. I too am running into the same problem, training a multi-layer LSTM network with extremely wide layers (100,000+ nodes per layer). Did you find a solution? – clemej Jun 21 '17 at 14:42
  • Yes, I was asking about large models, not large datasets, but unfortunately I have not found a solution yet. Please let me know what you discover! My next idea was to divide the model into thousands of separate models, each with its own attention mechanism, but we probably do not have enough data to train each model separately, which is why I have not tried it yet. – MBK Jun 22 '17 at 17:29
  • Use get_weights and set_weights to save and load the model weights, respectively; see the sketch after these comments. Have a look at this link: https://stackoverflow.com/questions/16639503/unable-to-save-dataframe-to-hdf5-object-header-message-is-too-large – Anurag Gupta Feb 15 '19 at 08:56
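A minimal sketch of that get_weights/set_weights workaround: store the architecture as JSON and the weights as a pickled list of arrays, so h5py attributes (and their header size limit) are never involved. The filenames and the use of pickle are illustrative assumptions, not from the comments:

import pickle
from keras.models import model_from_json

# save: architecture as JSON text, weights as a pickled list of numpy arrays
with open('model.json', 'w') as f:
    f.write(model.to_json())
with open('weights.pkl', 'wb') as f:
    pickle.dump(model.get_weights(), f)

# load: rebuild the architecture, then restore the weights
with open('model.json') as f:
    model = model_from_json(f.read())
with open('weights.pkl', 'rb') as f:
    model.set_weights(pickle.load(f))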

1 Answer


You may already know about HDF5's object header size limitation: Keras writes all layer names into a single HDF5 attribute, and with thousands of layers that attribute exceeds the limit. Look here for more info.

I ran into the same problem and solved it with a little trick: rename your layers to short strings before saving. I did that like this:

# give every layer a short, unique name before saving
for i, m in enumerate(model.layers):
    m.name = 'n' + str(i)

And it worked (don't let the 'n' confuse you; I just wanted my layer names to start with a letter instead of a digit). Note that layer names must be unique, which str(i) guarantees. If you need the original layer names after loading the model later, you can build a dictionary mapping the new names to the old ones and save it to a text file. After loading the model, read the dictionary back and use it to map the current layer names to the original ones. For example, create the dictionary like this:

dic = {}
for i, m in enumerate(model.layers):
    dic['n' + str(i)] = m.name  # remember the original name
    m.name = 'n' + str(i)       # then rename to the short form

and use it later like this:

for m in model.layers:
    m.name = dic[m.name]  # restore the original name
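To persist the mapping between sessions, the dictionary can be written to a text file as described above; a minimal sketch using json (the filename layer_names.json is an assumption):

import json

# after renaming: save the name mapping next to the weights
with open('layer_names.json', 'w') as f:
    json.dump(dic, f)

# after loading the model later: read the mapping back,
# then run the restore loop above
with open('layer_names.json') as f:
    dic = json.load(f)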
NKSHELL