I have a simple GRU network written with Keras in Python, as below:

gru1  = GRU(16, activation='tanh', return_sequences=True)(input)
dense  = TimeDistributed(Dense(16, activation='tanh'))(gru1)
output = TimeDistributed(Dense(1, activation="sigmoid"))(dense)

I've used a sigmoid activation for the output since my purpose is classification. But I need to use the same model for regression as well, which means changing the output activation to linear. The rest of the network stays the same, so as things stand I would need two separate networks for the two purposes. The inputs are the same, but the outputs are classes for the sigmoid activation and values for the linear activation.

My question is: is there any way to use only one network but get two different outputs at the end? Thanks.

Zabir Al Nazi
ICHaLiL

1 Answer

Yes, you can use the functional API to design a multi-output model. Keep the shared layers and add two output layers on top of them: one with a sigmoid activation, the other with a linear activation.

N.B.: Don't use input as a variable name; it shadows Python's built-in input() function.

from tensorflow.keras.layers import Input, GRU, Dense, TimeDistributed
from tensorflow.keras.models import Model
seq_len = 100 # your sequence length
input_ = Input(shape=(seq_len,1))
# shared layers used by both outputs
gru1  = GRU(16, activation='tanh', return_sequences=True)(input_)
dense  = TimeDistributed(Dense(16, activation='tanh'))(gru1)
# name the TimeDistributed wrappers (not the inner Dense layers) so that
# the model's output names are "out1" and "out2"
output1 = TimeDistributed(Dense(1, activation="sigmoid"), name="out1")(dense)
output2 = TimeDistributed(Dense(1, activation="linear"), name="out2")(dense)

model = Model(input_, [output1, output2])

model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_3 (InputLayer)            [(None, 100, 1)]     0                                            
__________________________________________________________________________________________________
gru_2 (GRU)                     (None, 100, 16)      912         input_3[0][0]                    
__________________________________________________________________________________________________
time_distributed_3 (TimeDistrib (None, 100, 16)      272         gru_2[0][0]                      
__________________________________________________________________________________________________
time_distributed_4 (TimeDistrib (None, 100, 1)       17          time_distributed_3[0][0]         
__________________________________________________________________________________________________
time_distributed_5 (TimeDistrib (None, 100, 1)       17          time_distributed_3[0][0]         
==================================================================================================
Total params: 1,218
Trainable params: 1,218
Non-trainable params: 0

Compiling with two loss functions:

losses = {
    "out1": "binary_crossentropy",
    "out2": "mse",
}

# compile with the Adam optimizer; each metric is attached to the output it suits

model.compile(optimizer='adam', loss=losses, metrics={"out1": "accuracy", "out2": "mae"})
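
To train such a model, the targets can be passed as a dictionary keyed by the same output names. The following is only a minimal sketch under a few assumptions: the output layers are named "out1" and "out2" on the TimeDistributed wrappers (so the dictionary keys match the model's output names), and the data, seq_len = 20, n_samples, y_class and y_value are made up purely for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Input, GRU, Dense, TimeDistributed
from tensorflow.keras.models import Model

seq_len = 20      # hypothetical sequence length for this demo
n_samples = 32    # hypothetical number of training sequences

# Same architecture as above: shared GRU + dense layers, two output heads.
inputs = Input(shape=(seq_len, 1))
x = GRU(16, activation="tanh", return_sequences=True)(inputs)
x = TimeDistributed(Dense(16, activation="tanh"))(x)
out1 = TimeDistributed(Dense(1, activation="sigmoid"), name="out1")(x)
out2 = TimeDistributed(Dense(1, activation="linear"), name="out2")(x)
model = Model(inputs, [out1, out2])

# One loss and one metric per output, keyed by output name.
model.compile(
    optimizer="adam",
    loss={"out1": "binary_crossentropy", "out2": "mse"},
    metrics={"out1": ["accuracy"], "out2": ["mae"]},
)

# Made-up data: a binary label and a real-valued target per time step.
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, seq_len, 1)).astype("float32")
y_class = (X > 0).astype("float32")      # classification targets
y_value = (2.0 * X).astype("float32")    # regression targets

model.fit(X, {"out1": y_class, "out2": y_value},
          epochs=1, batch_size=8, verbose=0)

# predict() returns one array per output, in the order given to Model().
pred_class, pred_value = model.predict(X, verbose=0)
```

Both heads are trained in the same fit() call and share the GRU and dense weights; Keras sums the two losses into the total loss it minimizes.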
Zabir Al Nazi
  • Thank you for your comment, Zabir. There's another point here: I need to train these two models with different loss functions, MSE for the linear activation and binary cross-entropy for the sigmoid activation. Is that possible with this structure? – ICHaLiL Apr 24 '20 at 21:40
  • Definitely, you can check my updated answer. If it helps, don't hesitate to upvote/accept. – Zabir Al Nazi Apr 24 '20 at 22:09