I want to train a Keras LSTM using a batch size of one. As far as I understand, I wouldn't need zero padding in this case: the necessity for zero padding comes from equalizing the sequence lengths within a batch, right?
It turns out this is not as easy as I thought.
My network looks like this:
from keras.models import Sequential
from keras.layers import Embedding, Dropout, Bidirectional, LSTM, TimeDistributed, Dense, Activation

model = Sequential()
# Frozen embedding layer initialized with pre-trained vectors
model.add(Embedding(output_dim=embeddings.shape[1],
                    input_dim=embeddings.shape[0],
                    weights=[embeddings],
                    trainable=False))
model.add(Dropout(0.2))
# Bidirectional LSTM returning the full sequence, for per-token labelling
model.add(Bidirectional(LSTM(100,
                             return_sequences=True,
                             activation="tanh",
                             kernel_initializer="glorot_uniform")))
# One softmax over the label set at every time step
model.add(TimeDistributed(Dense(maxLabel)))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=20, batch_size=1, shuffle=True)
I naively provide my training data and training labels as plain numpy arrays with these shape properties: X: (2161,), Y: (2161,).
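To illustrate what I mean (assuming I have this right, each entry holds one variable-length sequence; the inner lengths shown in the comments are just made up, my sentences vary in length):

print(x_train.shape)      # (2161,)
print(y_train.shape)      # (2161,)
print(x_train[0].shape)   # e.g. (37,) -- word indices for the first sentence
print(y_train[0].shape)   # e.g. (37,) -- one label index per token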
I now get a ValueError: Error when checking target: expected activation_1 to have 3 dimensions, but got array with shape (2161, 1)
I am not sure how to satisfy this 3-D requirement without zero padding, which is exactly what I wanted to avoid by working with a batch size of one in the first place.
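The only workaround I can think of is to skip model.fit and feed one sequence at a time with train_on_batch, one-hot encoding each label sequence with to_categorical and adding the batch dimension by hand. Roughly this sketch, though I don't know whether that is the intended way:

import numpy as np
from keras.utils import to_categorical

for epoch in range(20):
    for x_seq, y_seq in zip(x_train, y_train):
        # add the batch dimension: (timesteps,) -> (1, timesteps)
        x_batch = np.expand_dims(x_seq, axis=0)
        # one-hot the labels and add the batch dimension: (timesteps,) -> (1, timesteps, maxLabel)
        y_batch = np.expand_dims(to_categorical(y_seq, num_classes=maxLabel), axis=0)
        model.train_on_batch(x_batch, y_batch)

This would at least give the 3-D targets the error asks for, but I would rather understand whether fit can be used directly.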
Does anyone see what I am doing wrong?