
I am training an LSTM model with Keras on a dataset that looks like the following. The variable "Description" is a text field, while "Age" is continuous and "Gender" is categorical.

Age, Gender, Description
22, M, "purchased a phone"
35, F, "shopping for kids"

I am using word embeddings to convert the text field to word vectors, which are then fed into the Keras model. The code is given below:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense, Activation

model = Sequential()
model.add(Embedding(len(word_index) + 1, 300, weights=[embedding_matrix],
                    input_length=70, trainable=False))
model.add(LSTM(300, dropout=0.3, recurrent_dropout=0.3))
model.add(Dropout(0.6))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
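
For reference, word_index and the length-70 inputs above would typically come from a standard tokenize-and-pad step along these lines (a sketch; df is the loaded CSV):

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(df['Description'])
word_index = tokenizer.word_index                     # word -> integer id
X_text = pad_sequences(tokenizer.texts_to_sequences(df['Description']), maxlen=70)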

This model runs successfully, but I also want to feed in the "Age" and "Gender" variables as features. What changes are required in the code to use these features as well?

userxxx

3 Answers


You want to add more input layers, which is not possible with the Sequential model; you have to switch to the functional model

from keras.models import Model

which allows you to have multiple inputs and indirect connections.

from keras.layers import Input, Embedding, LSTM, Dense, Dropout, Activation, Concatenate

text_input = Input(shape=(70,))                       # the padded token sequence
embed = Embedding(len(word_index) + 1, 300, weights=[embedding_matrix],
                  input_length=70, trainable=False)(text_input)
lstm = LSTM(300, dropout=0.3, recurrent_dropout=0.3)(embed)
agei = Input(shape=(1,))                              # the extra age feature
conc = Concatenate()([lstm, agei])
drop = Dropout(0.6)(conc)
dens = Dense(1)(drop)
acti = Activation('sigmoid')(dens)

model = Model([text_input, agei], acti)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

You cannot concatenate before the LSTM layer: the embedding layer outputs a 3D tensor (batch, timesteps, features), while the additional input is a 2D tensor, and mixing a per-sample feature into the token sequence doesn't make sense there.
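
A model built this way is trained by passing one array per Input, in the order given to Model. A minimal sketch, assuming padded_text holds the tokenized length-70 sequences, df is the loaded CSV, and y is the label array (gender would be one more Input handled the same way after encoding M/F as 0/1):

X_age = df['Age'].values.reshape(-1, 1).astype('float32')
model.fit([padded_text, X_age], y, batch_size=32, epochs=5)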

Andrew Lavers
Suba Selvandran
  • `agei` is the input for the `age` and `gender`, right? If so, how can I use my `input.csv` that has age and gender information? – Abu Shoeb Sep 01 '18 at 07:13
  • @userxxx hey can you please share your full code? I want to use the additional features with word embedding. – Abu Shoeb Sep 01 '18 at 21:57
  • You can find a complete example with an actual dataset here: https://stackabuse.com/python-for-nlp-creating-multi-data-type-classification-models-with-keras/ – Sarah Oct 10 '20 at 03:16

I wrote about how to do this in Keras. It's basically a functional multi-input model, which concatenates both feature vectors like this:

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Bidirectional, Dense, concatenate

nlp_input = Input(shape=(seq_length,), name='nlp_input')
meta_input = Input(shape=(10,), name='meta_input')
emb = Embedding(output_dim=embedding_size, input_dim=100, input_length=seq_length)(nlp_input)
nlp_out = Bidirectional(LSTM(128))(emb)
x = concatenate([nlp_out, meta_input])
x = Dense(classifier_neurons, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[nlp_input, meta_input], outputs=[x])
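
Since the inputs are named, training can then pass either a list or a dict keyed by those names (a sketch; X_text, X_meta, and y are assumed to be prepared arrays):

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit({'nlp_input': X_text, 'meta_input': X_meta}, y, epochs=5, batch_size=32)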
ixeption
  • Thanks for the great info. I am following your code for my use case and getting `InvalidArgumentError: input 1 should contain 3 elements, but got 2[[{{node training/Adam/gradients/concatenate_1/concat_grad/ConcatOffset}}]]`. Do you have idea why? I checked here and on Github for hours, but no luck so far. – KLaz Aug 20 '19 at 21:21
  • @ixeption Read your article. What if you have two text columns like for instance [title, article] - in a multi classification prob. [agree, disagree, discuss, unrelated]. Do you create embeddings for each one and then pass it to one single LSTM? I understand from above that it's pointless to concat. prior to LSTM, but how would you pass two different embeddings to an LSTM? Is it simple like LSTM()([embed1, embed2]) – StackPancakes Aug 24 '20 at 20:20
  • @StackPancakes I would use the same embeddings and LSTM for both NLP inputs and concatenate the outputs of the LSTM together. You can use a TimeDistributed layer, and your dimension would be 2. I did something similar with [Chats](http://digital-thinking.de/deepchat-recurrent-neural-networks-for-dialog-classification/). – ixeption Aug 26 '20 at 14:57
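
The shared-weights setup @ixeption describes in the last comment would look roughly like this (a sketch; seq_length, vocab_size, and the four stance classes come from the comment's example):

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense, concatenate

title_in = Input(shape=(seq_length,))
body_in = Input(shape=(seq_length,))

shared_emb = Embedding(vocab_size, 300)   # one set of weights...
shared_lstm = LSTM(128)                   # ...applied to both text inputs

title_vec = shared_lstm(shared_emb(title_in))
body_vec = shared_lstm(shared_emb(body_in))

merged = concatenate([title_vec, body_vec])
out = Dense(4, activation='softmax')(merged)   # agree/disagree/discuss/unrelated
model = Model([title_in, body_in], out)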

Consider having a separate feedforward network that takes in those features and outputs some n-dimensional vector.

from keras.layers import Input, Dense

time_independent = Input(shape=(num_features,))
dense_1 = Dense(200, activation='tanh')(time_independent)
dense_2 = Dense(300, activation='tanh')(dense_1)

To do this, you will need Keras' functional API rather than the Sequential model.

You would then either pass this in as the LSTM's initial hidden state, or concatenate it with every word embedding so that the LSTM sees it at every timestep. In the latter case, you would want to drastically reduce the dimensionality of that vector.
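
A minimal sketch of both options, building on the dense_1/dense_2 network above plus a text input (seq_length and vocab_size are illustrative names):

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense, RepeatVector, concatenate

text_in = Input(shape=(seq_length,))
emb = Embedding(vocab_size, 300)(text_in)            # (batch, seq_length, 300)

# Option 1: feed the 300-d feature vector in as the LSTM's initial
# hidden and cell state (both must match the LSTM's 300 units).
nlp_out = LSTM(300)(emb, initial_state=[dense_2, dense_2])

# Option 2 (instead of option 1): shrink the features, repeat them
# across all timesteps, and concatenate with each word embedding.
small = Dense(16, activation='tanh')(time_independent)
tiled = RepeatVector(seq_length)(small)              # (batch, seq_length, 16)
nlp_out = LSTM(300)(concatenate([emb, tiled]))

out = Dense(1, activation='sigmoid')(nlp_out)
model = Model([text_in, time_independent], out)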

If you need a fuller example, let me know.

modesitt
  • An example will be helpful – userxxx Mar 08 '18 at 15:32
  • @modesitt could you please give a complete example? I need to use additional features with word-embeddings. My additional features are stored in `input.csv` file and it has 3 columns. – Abu Shoeb Sep 01 '18 at 07:17