Multiple Input types in a keras Neural Network

Question

As an example, I'd like to train a neural network to predict the location of a picture (longitude, latitude), with the image, temperature, humidity, and time of year as inputs to the model.

My question is: what is the best way to add this additional information to a CNN? Should I merge the numeric inputs with the CNN at the last dense layer, or at the beginning? Should I encode the numeric values (temperature, humidity, and time of year)?

Any information, resources, sources would be greatly appreciated, thanks in advance.

score 4 · Answer 1 · edited Jun 24 '22 at 15:49

You can process numeric inputs separately and merge them afterwards before making the final prediction:

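# Imports for the functional API (the tf.keras namespace is assumed here)
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

# Your usual CNN, whatever it may be; it should end in a flat feature vector.
# Keras' default channels_last layout is (height, width, channels).
img_in = Input(shape=(height, width, channels))
img_features = SomeCNN(...)(img_in)

# Your usual MLP model for the 3 numeric inputs (temperature, humidity, time of year)
aux_in = Input(shape=(3,))
aux_features = Dense(24, activation='relu')(aux_in)

# Possibly add more hidden layers, then merge the two feature vectors
merged = concatenate([img_features, aux_features])

# Create the last layer: a softmax over num_locations discrete location classes
out = Dense(num_locations, activation='softmax')(merged)

# Build the model with both inputs
model = Model([img_in, aux_in], out)
model.compile(loss='categorical_crossentropy', ...)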

Essentially, you treat them as separate inputs and learn useful features from each that, once combined, allow your model to make the prediction. How you encode the numeric inputs really depends on their type.
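Since the question asks for longitude and latitude rather than a class label, here is a minimal end-to-end sketch of the same two-branch idea with a regression head. Everything concrete in it is an assumption for illustration and not part of the answer above: the tiny CNN, the 64x64 RGB input, the layer sizes, the two-unit linear output, the MSE loss, and the random dummy data.

```python
import numpy as np
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     GlobalAveragePooling2D, Dense, concatenate)
from tensorflow.keras.models import Model

# Image branch: a deliberately tiny CNN (all sizes are illustrative assumptions)
img_in = Input(shape=(64, 64, 3))
x = Conv2D(16, 3, activation='relu')(img_in)
x = MaxPooling2D()(x)
x = Conv2D(32, 3, activation='relu')(x)
img_features = GlobalAveragePooling2D()(x)  # flat vector of 32 features

# Numeric branch: temperature, humidity, time of year
aux_in = Input(shape=(3,))
aux_features = Dense(24, activation='relu')(aux_in)

# Merge the two feature vectors (32 + 24 = 56 features)
merged = concatenate([img_features, aux_features])

# Assumed regression head: two linear units for (latitude, longitude)
out = Dense(2)(merged)

model = Model([img_in, aux_in], out)
model.compile(optimizer='adam', loss='mse')

# Random dummy data, only to show the expected shapes
images = np.random.rand(8, 64, 64, 3).astype('float32')
aux = np.random.rand(8, 3).astype('float32')
targets = np.random.rand(8, 2).astype('float32')

model.fit([images, aux], targets, epochs=1, batch_size=4, verbose=0)
preds = model.predict([images, aux], verbose=0)
```

Whether regression or the softmax classification over location bins works better is a modelling choice; if you regress, scaling the targets to a small range is probably a good idea.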

For continuous inputs like temperature, you can normalize to the range [-1, 1]; for discrete inputs, one-hot encoding is very common. Here is a quick guide.
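A rough sketch of these encodings in plain Python follows. The value ranges and helper names are made up for illustration, and the sine/cosine encoding of the day of year is my own addition (it is not mentioned above): it is one common way to keep Dec 31 and Jan 1 close together.

```python
import math

# Assumed value ranges; pick ones that match your own data
TEMP_RANGE_C = (-40.0, 50.0)
HUMIDITY_RANGE_PCT = (0.0, 100.0)

def scale_to_unit_range(value, lo, hi):
    """Linearly map value from [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def encode_day_of_year(day, days_in_year=365):
    """Cyclical (sin, cos) encoding of a 1-based day of the year."""
    angle = 2.0 * math.pi * (day - 1) / days_in_year
    return math.sin(angle), math.cos(angle)

def one_hot(index, num_classes):
    """One-hot vector for a discrete input such as a category id."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec

def encode_aux(temp_c, humidity_pct, day):
    """Build the numeric feature vector for one sample."""
    t = scale_to_unit_range(temp_c, *TEMP_RANGE_C)
    h = scale_to_unit_range(humidity_pct, *HUMIDITY_RANGE_PCT)
    s, c = encode_day_of_year(day)
    return [t, h, s, c]
```

Note that with the cyclical encoding the numeric vector has 4 entries instead of 3, so the auxiliary input would need `shape=(4,)`.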

edited Jun 24 '22 at 15:49

Innat

answered May 27 '18 at 21:56

nuric

Thanks a lot, this will be very helpful. One more question: are simple Dense layers the best for numeric data, or does it make sense to use more complex layers found in CNNs (pooling, batch, conv layer, etc.)? – user3029296 May 27 '18 at 22:49
@nuric, Thanks a lot. Explained pretty well. – sahaj patel Aug 29 '19 at 15:26
@nuric How do you keep the consistency between an image and its corresponding numeric inputs while training? – bit_scientist Feb 11 '20 at 01:53
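On the pairing question: as far as I know, when you pass arrays with the same first dimension, e.g. `model.fit([images, aux], labels)`, Keras pairs them by position, so row i of each array is treated as one sample and shuffling is applied to the sample index, not to each array separately. If you shuffle or split the data yourself, the point is to apply one permutation to everything. A minimal plain-Python sketch (the helper name `shuffled_pairs` is hypothetical):

```python
import random

def shuffled_pairs(images, aux_rows, labels, seed=None):
    """Shuffle three parallel lists with one shared permutation."""
    if not (len(images) == len(aux_rows) == len(labels)):
        raise ValueError("all inputs must have the same number of samples")
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # a single permutation for all lists
    return ([images[i] for i in order],
            [aux_rows[i] for i in order],
            [labels[i] for i in order])

# Toy data: sample i is ("img<i>", [i], i), so any mismatch is easy to spot
images = ["img0", "img1", "img2", "img3"]
aux_rows = [[0], [1], [2], [3]]
labels = [0, 1, 2, 3]

s_images, s_aux, s_labels = shuffled_pairs(images, aux_rows, labels, seed=0)
```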

score 0 · Answer 2 · answered May 27 '18 at 23:18

If you want to predict based on those four features, I would suggest going with a CNN + RNN.

Feed the image to the CNN and take its output logits, flattened into a vector:

logits = np.array(output).flatten()

Then build a sequence like [[logits], [temperature], [humidity], [time_of_year]] and feed it to the RNN, which will treat it as a sequence input.

answered May 27 '18 at 23:18

Aaditya Ura

Could you shed some more light on what you suggested? At what stage are the logits taken to use in RNN afterwards? – bit_scientist Feb 11 '20 at 01:56
