I want to update the weights only at the end of each batch, and I know that this is the default behavior, but I don't understand why X and y need to have the same number of samples. If X has shape (12, 32, 64) and I use a batch size of 12, so there is just one batch, why is it not enough for y to have shape (1, N)?
I would like to backpropagate only after the entire batch has been shown to the network. Why do I need a label for each item in the batch?
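For concreteness, here is a minimal sketch of the shapes I have in mind (the names X_batch, y_one_per_batch, y_one_per_sample and the random arrays are just placeholders for illustration):

import numpy as np

# One batch: 12 samples, 32 timesteps, 64 features (placeholder data)
X_batch = np.random.rand(12, 32, 64)

# What I expected to be enough: one target row for the whole batch
y_one_per_batch = np.random.rand(1, 4)

# What the error below seems to ask for: one target row per sample
y_one_per_sample = np.random.rand(12, 4)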
Example code:
import numpy as np
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils import plot_model

def create_model(batch, timesteps, features):
    # stateful LSTM needs a fixed batch size, hence batch_shape
    inputTensor1 = Input(batch_shape=(batch, timesteps, features))
    lstm1 = LSTM(32, stateful=True, dropout=0.2)(inputTensor1)
    x = Dense(4, activation='linear')(lstm1)
    model = Model(inputs=inputTensor1, outputs=x)
    model.compile(loss='mse', optimizer='rmsprop', metrics=['mse'])
    print(model.summary())
    plot_model(model, show_shapes=True, show_layer_names=True)
    return model
X = np.load("").reshape(1280,12,640,32)
y = np.load("").reshape(1280,1,4)
prop_train = 0.8
ntrain = int(X.shape[0]*prop_train)
X_train, X_val = X[:ntrain], X[ntrain:]
y_train, y_val = y[:ntrain], y[ntrain:]
model = create_model(12, 640, 32)
for j in np.arange(1):
    for i in np.arange(X_train.shape[0]):
        print(i)
        model.reset_states()
        history = model.train_on_batch(X_train[i], y_train[i])
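For reference, given the reshapes above, these are the shapes that go into train_on_batch at every iteration:

print(X_train[0].shape)  # (12, 640, 32): 12 input samples per batch
print(y_train[0].shape)  # (1, 4): only 1 target row per batch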
Here I get the error:
ValueError: Input arrays should have the same number of samples as target arrays. Found 12 input samples and 1 target samples