I am trying to train an LSTM in Keras (TensorFlow backend) on toy data and am getting this error:
ValueError: Error when checking target: expected dense_39 to have 2 dimensions, but got array with shape (996, 1, 1)
The error occurs immediately upon calling model.fit; nothing seems to run. It looks to me like Keras is checking dimensions but ignoring the fact that it should be feeding a batch of my target along with each batch of my input. The error reports the full shape of my target array, which suggests that Keras never splits it into batches, at least while checking dimensions. For the life of me I can't figure out why this would be, or what else might be going on.
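To illustrate what I mean, here is a minimal standalone sketch (separate from my actual model below, and only my attempt to reproduce the same check): a model whose final layer is Dense(1) has a 2-D output, and fit() seems to validate the rank of the entire target array before any batching happens, so a 3-D target with a singleton axis trips the same error regardless of batch size.

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(5,))
out = Dense(1)(inp)                   # 2-D output: (batch, 1)
m = Model(inputs=inp, outputs=out)
m.compile(optimizer='adam', loss='mean_squared_error')

y_bad = np.zeros((16, 1, 1))          # 3-D target with a singleton middle axis
# m.fit(np.zeros((16, 5)), y_bad, batch_size=8)
# -> raises the same kind of "expected dense_... to have 2 dimensions,
#    but got array with shape (16, 1, 1)" ValueError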
My network definition with expected layer output shapes in comments:
import numpy as np
from keras.layers import Input, Dense, LSTM, TimeDistributed
from keras.models import Model
from keras.optimizers import Nadam
from keras.callbacks import LambdaCallback

batch_shape = (8, 5, 1)
x_in = Input(batch_shape=batch_shape, name='input')            # (8, 5, 1)
seq1 = LSTM(8, return_sequences=True, stateful=True)(x_in)     # (8, 5, 8)
dense1 = TimeDistributed(Dense(8))(seq1)                       # (8, 5, 8)
seq2 = LSTM(8, return_sequences=False, stateful=True)(dense1)  # (8, 8)
dense2 = Dense(8)(seq2)                                        # (8, 8)
out = Dense(1)(dense2)                                         # (8, 1)

model = Model(inputs=x_in, outputs=out)
optimizer = Nadam()
model.compile(optimizer=optimizer, loss='mean_squared_error')
model.summary()
The model summary, shapes as expected:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (8, 5, 1)                 0
_________________________________________________________________
lstm_28 (LSTM)               (8, 5, 8)                 320
_________________________________________________________________
time_distributed_18 (TimeDis (8, 5, 8)                 72
_________________________________________________________________
lstm_29 (LSTM)               (8, 8)                    544
_________________________________________________________________
dense_38 (Dense)             (8, 8)                    72
_________________________________________________________________
dense_39 (Dense)             (8, 1)                    9
=================================================================
Total params: 1,017
Trainable params: 1,017
Non-trainable params: 0
_________________________________________________________________
My toy data: the target is just a line decreasing from 100 to 0, and the input is just an array of zeros. I want to do one-step-ahead prediction, so I create rolling windows of both the input and the target with a rolling_window() method defined below:
target = np.linspace(100, 0, num=1000)
y_train_rolling = rolling_window(target[4:], 1)[:, :, None]
y_train_rolling.shape  # (996, 1, 1) <-- this seems to be the array that's causing the error
x_train = np.zeros((1000,))
x_train_rolling = rolling_window(x_train, 5)[:, :, None]
x_train_rolling.shape # (996, 5, 1)
The rolling_window() method:
def rolling_window(arr, window):
    # Strided view of `arr` with one row per window position (no data copy).
    shape = arr.shape[:-1] + (arr.shape[-1] - window + 1, window)
    strides = arr.strides + (arr.strides[-1],)
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
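For reference, a quick sanity check of what rolling_window() returns on a tiny array (just a usage sketch, run right after the definition above):

rolling_window(np.arange(6), 3)
# array([[0, 1, 2],
#        [1, 2, 3],
#        [2, 3, 4],
#        [3, 4, 5]])
# shape (4, 3): one row per window start, as a strided view with no copy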
And my training loop:
reset_state = LambdaCallback(on_epoch_end=lambda epoch, logs: model.reset_states())
callbacks = [reset_state]
history = model.fit(x_train_rolling, y_train_rolling,
                    batch_size=8,
                    epochs=100,
                    validation_split=0.,
                    callbacks=callbacks)
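In case it helps with debugging, here is how the declared output shape compares to my target's shape (a quick diagnostic sketch, not part of the training script; the 2-D view at the end is only my guess at the rank the check wants):

print(model.output_shape)              # (8, 1)      -- the final Dense gives a 2-D output
print(y_train_rolling.shape)           # (996, 1, 1) -- 3-D, which the check appears to reject
print(y_train_rolling[:, 0, :].shape)  # (996, 1)    -- the 2-D view, if the singleton axis is the problem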
I have tried:
- A non-stateful LSTM, but I really need stateful for the eventual application. Same error.
- return_sequences=True in the second LSTM with a Flatten layer after it. Same error.
- return_sequences=True without a Flatten layer. This gives a different error, because it then expects a target with the same shape as the output, which at that point is (batch_size, 5, 1) and not (batch_size, 1, 1).
- Running the same architecture on the whole sequence at once (batch size of 1), without rolling windows. This works, but it just learns to approximate the mean of my target and is useless for my purposes.
Note that none of these questions seem to directly answer mine, although I was really hopeful about a couple:
- Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (14724, 1)
- LSTM and CNN: ValueError: Error when checking target: expected time_distributed_1 to have 3 dimensions, but got array with shape (400, 256)
- ValueError: Error when checking target: expected lstm_27 to have 2 dimensions, but got array with shape (1, 11, 1)
- expected dense_218_input to have 2 dimensions, but got array with shape (512, 28, 28, 1)
- expected dense_1 to have 2 dimensions, but got array with shape (308, 1, 6)