
I am trying to understand what "add" does. Why is the output of the Add layer (None, 38, 300) when it adds two inputs with different shapes here?

Following is the partial code in Keras.

from keras.layers import (Input, Dense, BatchNormalization, Embedding,
                          Dropout, LSTM, Activation, add)

# The hyperparameters are not shown in the original snippet; these values
# are inferred from the reported Add output shape (None, 38, 300).
EMBEDDING_DIM = 300
MAX_CAPTION_SIZE = 38
VOCABULARY_SIZE = 10000  # placeholder; the real value is not given

# Image branch: project a 2048-d image feature vector to EMBEDDING_DIM.
image_model = Input(shape=(2048,))
x = Dense(units=EMBEDDING_DIM, activation="relu")(image_model)  # (None, 300)
x = BatchNormalization()(x)

# Language branch: embed each caption token to EMBEDDING_DIM.
language_model = Input(shape=(MAX_CAPTION_SIZE,))
y = Embedding(input_dim=VOCABULARY_SIZE, output_dim=EMBEDDING_DIM)(language_model)  # (None, 38, 300)
y = Dropout(0.5)(y)

merged = add([x, y])  # (None, 38, 300)
merged = LSTM(256, return_sequences=False)(merged)
merged = Dense(units=VOCABULARY_SIZE)(merged)
merged = Activation("softmax")(merged)


user3267989

1 Answer


Why is the output of Add layer (None, 38, 300) when adding two inputs with different shapes here?

It's a technique called broadcasting. You can find more details here: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
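For intuition, the same rule can be seen in plain NumPy (a minimal sketch; the shapes mirror the Keras example that follows):

import numpy as np

a = np.ones((16,))    # shape (16,)
b = np.ones((2, 16))  # shape (2, 16)

# NumPy aligns trailing dimensions: a is stretched along the missing
# axis of size 2, so the result has shape (2, 16).
c = a + b
print(c.shape)  # (2, 16)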

In the Keras example below, the first input, of shape (16,), is broadcast along the second dimension (of size 2) of the second input, of shape (2, 16), so that the element-wise addition can happen.

import keras
import numpy as np

input1 = keras.layers.Input(shape=(16,))      # batch shape (None, 16)
input2 = keras.layers.Input(shape=(2, 16))    # batch shape (None, 2, 16)
added = keras.layers.Add()([input1, input2])  # input1 is broadcast to (None, 2, 16)
model = keras.models.Model(inputs=[input1, input2], outputs=added)
output = model.predict([np.ones((1, 16)), np.ones((1, 2, 16))])
print(output.shape)
print(output)

(1, 2, 16)

[[[2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
  [2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]]]
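The same rule explains the original model. Assuming EMBEDDING_DIM is 300 and MAX_CAPTION_SIZE is 38 (which the reported output shape implies), x has shape (None, 300) and y has shape (None, 38, 300), so x is broadcast along the 38 caption timesteps and the Add layer outputs (None, 38, 300).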

Manoj Mohan