
I want to do transfer learning with simple MLP models. First I train a feed-forward network with one hidden layer on a large dataset:

from keras.models import Sequential
from keras.layers import Dense

net = Sequential()
net.add(Dense(500, input_dim=2048, kernel_initializer='normal', activation='relu'))
net.add(Dense(1, kernel_initializer='normal'))
net.compile(loss='mean_absolute_error', optimizer='adam')
net.fit(x_transf,
        y_transf,
        epochs=1000,
        batch_size=8,
        verbose=0)

Then I want to reuse the single hidden layer as the first layer of a new network, to which I want to add a second hidden layer. The reused layer should not be trainable.

idx = 1  # index of desired layer
input_shape = net.layers[idx].get_input_shape_at(0) # get the input shape of desired layer
input_layer = net.layers[idx]
input_layer.trainable = False

transf_model = Sequential()
transf_model.add(input_layer)
transf_model.add(Dense(input_shape[1], activation='relu'))
transf_model.compile(loss='mean_absolute_error', optimizer='adam')
transf_model.fit(x, 
                 y,
                 epochs=10, 
                 batch_size=8, 
                 verbose=0)

EDIT: The above code returns:

ValueError: Error when checking target: expected dense_9 to have shape (None, 500) but got array with shape (436, 1)

What's the trick to make this work?

tevang
  • The shared layer you used in the second model is expecting 2D inputs, but you are feeding the model with 3D inputs?! – today Jan 05 '19 at 19:55
  • Please, somebody? It must be fairly simple to answer for someone who is familiar with Keras. – tevang Jan 07 '19 at 10:26
  • Did you read my comment? – today Jan 07 '19 at 10:40
  • @today Read my EDIT. – tevang Jan 07 '19 at 13:00
  • The last `Dense` layer in the second model should have 1 unit, not `input_shape[1]` units, right? I think you are making it a little complicated. There are better ways of doing this. – today Jan 07 '19 at 13:06
  • Could you please write an example? What changes would you do to my sample code? – tevang Jan 07 '19 at 13:52
  • Sure, but could you explain what would be different in the second model? How many layers does it have? And what is the expected output shape of the model, is it `(None, 500)` or `(None, 1)`? – today Jan 07 '19 at 14:31
  • Just add an extra Dense layer with 800 neurons. The output should be (None, 1); I forgot to add this before `transf_model.compile(...)`: `transf_model.add(Dense(1, kernel_initializer='normal'))` – tevang Jan 07 '19 at 14:33
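
A minimal sketch of the corrected Sequential version that the comment thread converges on (the frozen first hidden layer, an extra 800-unit layer, and the forgotten `Dense(1)` output; the relu activation on the new layer is an assumption, mirroring the first network):

shared_layer = net.layers[0]    # the 500-unit hidden layer, not the output layer
shared_layer.trainable = False  # freeze the transferred layer

transf_model = Sequential()
transf_model.add(shared_layer)
transf_model.add(Dense(800, activation='relu'))          # extra hidden layer (activation assumed)
transf_model.add(Dense(1, kernel_initializer='normal'))  # the forgotten output layer
transf_model.compile(loss='mean_absolute_error', optimizer='adam')
transf_model.fit(x, y, epochs=10, batch_size=8, verbose=0)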

1 Answer


I would simply use the Functional API to build such a model:

from keras.models import Model
from keras.layers import Input, Dense

shared_layer = net.layers[0]    # you want the first hidden layer, so index = 0
shared_layer.trainable = False

inp = Input(shape=(2048,))      # the shape of one input sample
x = shared_layer(inp)
x = Dense(800, activation='relu')(x)            # extra hidden layer (activation assumed, mirroring the first network)
out = Dense(1, kernel_initializer='normal')(x)  # the output layer from the comments

model = Model(inp, out)

# the rest is the same...
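
That is, assuming the same loss, optimizer, and fit settings as in the question:

model.compile(loss='mean_absolute_error', optimizer='adam')
model.fit(x, y, epochs=10, batch_size=8, verbose=0)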
today
  • What if the 1st network had 2 hidden layers and I wanted to transfer both of them to the new network? – tevang Jan 07 '19 at 15:21
  • @tevang It is the same: you get the layers and apply them on tensors (i.e. output of previous layers). For example, `x = shared_layer1(inp)` and then `x = shared_layer2(x)` (see the sketch after these comments). – today Jan 07 '19 at 15:23
  • Yes, that's it! Thank you very much! – tevang Jan 07 '19 at 18:18
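
A minimal sketch of the two-hidden-layer case from the last comments, assuming a base network `net` that (unlike the one in the question) has two hidden layers before its output:

from keras.models import Model
from keras.layers import Input, Dense

shared_layer1 = net.layers[0]    # first hidden layer of the base net
shared_layer2 = net.layers[1]    # second hidden layer (hypothetical here)
shared_layer1.trainable = False
shared_layer2.trainable = False

inp = Input(shape=(2048,))
x = shared_layer1(inp)           # apply the first transferred layer
x = shared_layer2(x)             # then the second, as in the comment
out = Dense(1, kernel_initializer='normal')(x)

model = Model(inp, out)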