
For example, I have a model with 3 intermediate layers:

Model1 : Input1 --> L1 --> L2 --> L3,

and want to split it into

Model2 : Input2 --> L1 --> L2

and Model3 : Input3 --> L3.

It is easy to stack these two to get the first one using the functional API, but I'm not sure how to do the opposite.

The first split model can be obtained by: Model(Input1, L2.output), but the second one is not that easy. What is the simplest way to do this?

Example code:

# the first model
from keras.layers import Input, Dense
from keras.models import Model

input1 = Input(shape=(784,))
l1 = Dense(64, activation='relu')(input1)
l2 = Dense(64, activation='relu')(l1)
l3 = Dense(10, activation='softmax')(l2)
model1 = Model(input1, l3)

I want to build model2 and model3 described above that share weights with model1 while model1 already exists (maybe loaded from disk).

Thanks!

Jeff Dong
  • To clarify: Model(Input3, L3) wouldn't work? Why? – parsethis Apr 11 '17 at 15:09
  • Can you write some code as an example of what you're trying to achieve? – parsethis Apr 11 '17 at 15:10
  • @putonspectacles Thanks for the reply. Model(Input3, L3) is not possible since Input3 does not exist and L3's input is L2's output. I think the key here is how to reset L3's input properly. I have added some example code. – Jeff Dong Apr 11 '17 at 23:57

1 Answer


In short, an extra Input is needed, because an input tensor is different from an intermediate tensor.

First define the shared layers:

l1 = Dense(64, activation='relu')
l2 = Dense(64, activation='relu')
l3 = Dense(10, activation='softmax')

Remember that:

input1 = Input(shape=(784,))  # input1 is an input tensor
o1 = l1(input1)  # o1 is an intermediate tensor

Model1 can be defined as model1 = Model(input1, l3(l2(l1(input1))) )

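To define model2, you first have to define a new input tensor: input2 = Input(shape=(64,)). Then model2 = Model(input2, l3(l2(input2))). Note that this model2 is the part of model1 that comes after l1. For the split described in the question, use model2 = Model(input1, l2(l1(input1))) and, with input3 = Input(shape=(64,)), model3 = Model(input3, l3(input3)).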
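Put together, the shared-layer approach could look like the sketch below. It follows the question's naming (model2 is Input2 --> L1 --> L2, model3 is Input3 --> L3) and assumes the `tensorflow.keras` import path; the random batch `x` and the names `full` and `stacked` are only there to illustrate that the two halves reproduce the full model.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Shared layer objects: every model below reuses these same weights.
l1 = Dense(64, activation='relu')
l2 = Dense(64, activation='relu')
l3 = Dense(10, activation='softmax')

# Full model: Input1 --> L1 --> L2 --> L3
input1 = Input(shape=(784,))
model1 = Model(input1, l3(l2(l1(input1))))

# First half: Input2 --> L1 --> L2
input2 = Input(shape=(784,))
model2 = Model(input2, l2(l1(input2)))

# Second half: Input3 --> L3 (a new Input matching L2's 64-unit output)
input3 = Input(shape=(64,))
model3 = Model(input3, l3(input3))

# Illustrative check: feeding model2's output into model3 matches model1.
x = np.random.rand(4, 784).astype('float32')
full = model1.predict(x, verbose=0)
stacked = model3.predict(model2.predict(x, verbose=0), verbose=0)
```

Because all three models are built from the same layer objects, training any one of them updates the weights seen by the others.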

Van
  • Thanks a lot. This works for me. One further question: could it be simpler? This method needs to reconstruct the entire computational graph of model2, which is already contained in model1. When model1 is a loaded pre-trained model and very complicated, extracting all intermediate layers and rebuilding model2 may be hard. Ideally, one would just specify two points in model1 and somehow 'extract' the part in between as a submodel. Is that possible with Keras? – Jeff Dong Apr 12 '17 at 13:01
  • To the best of my knowledge, you cannot do that, because Keras depends on TF or Theano, and those frameworks treat input tensors differently from intermediate tensors. – Van Apr 19 '17 at 09:36
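For a model that is a plain linear chain of layers, the rebuilding step discussed in these comments can at least be automated by looping over `model.layers`. The sketch below is one possible way, not a built-in Keras feature: `tail_from` is a hypothetical helper, the `tensorflow.keras` import path is an assumption, and the small model1 built here only stands in for a model loaded from disk. It does not handle branching graphs.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def tail_from(model, start):
    """Hypothetical helper: reapply model.layers[start:] to a fresh Input.

    Only valid when the model is a simple chain (each layer feeds the next).
    """
    # Read the expected input shape (without the batch axis) before reusing the layer.
    shape = tuple(model.layers[start].input.shape[1:])
    new_input = Input(shape=shape)
    x = new_input
    for layer in model.layers[start:]:
        x = layer(x)  # same layer object, so weights are shared with `model`
    return Model(new_input, x)

# Stand-in for an existing (for example, loaded) model.
inputs = Input(shape=(784,))
h = Dense(64, activation='relu')(inputs)
h = Dense(64, activation='relu')(h)
outputs = Dense(10, activation='softmax')(h)
model1 = Model(inputs, outputs)

# model1.layers[0] is the InputLayer, so the Dense layers sit at indices 1, 2, 3.
model2 = Model(model1.input, model1.layers[2].output)  # Input --> L1 --> L2
model3 = tail_from(model1, 3)                          # new Input --> L3

x = np.random.rand(4, 784).astype('float32')
full = model1.predict(x, verbose=0)
stacked = model3.predict(model2.predict(x, verbose=0), verbose=0)
```

This still builds a new graph for the tail, as the answer describes; it only saves writing the layer calls out by hand. Depending on the Keras version, reading a layer's `input` after that layer has been reused may fail, which is why the helper reads the shape before reapplying anything.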