
I am building a DQN with LSTM layers. I am trying to pass 96-timeperiod, 33-feature arrays to the model for training, i.e. shape=(96, 33). I am also trying to implement a post-padding mask (mask_value=0.) to accommodate variable-length sequences (max length = 96).
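For reference, the padding step looks roughly like this (a sketch; raw_sequences is just an illustrative name for my list of (seq_len, 33) arrays):

from tensorflow.keras.preprocessing.sequence import pad_sequences

# raw_sequences: list of arrays, each of shape (seq_len, 33) with seq_len <= 96
padded = pad_sequences(raw_sequences, maxlen=96, dtype='float32',
                       padding='post', value=0.)
# padded.shape == (num_sequences, 96, 33); padded timesteps are all zeros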

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
inp = (NUM_TIMEPERIODS, NUM_FEATURES)  # (96, 33)
# Masking flags any timestep whose 33 features are all 0. so the LSTMs skip it
model.add(Masking(mask_value=0., input_shape=inp))
model.add(LSTM(NUM_FEATURES, activation='tanh', return_sequences=True))
model.add(LSTM(NUM_FEATURES, activation='tanh'))
model.add(Dense(NUM_FEATURES, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=Adam(lr=LEARNING_RATE, decay=DECAY),
              metrics=['accuracy'])

When I submit a sequence called current_states, with shape (96, 33), using:

current_qs_list = self.model.predict(current_states)

to generate Q-values, it returns the error:

ValueError: Input 0 of layer lstm is incompatible with the layer: 
expected ndim=3, found ndim=2. Full shape received: [32, 33]

I think the 32 is the masked length (out of the max length of 96) of this first sequence being submitted to the model, with the padded timesteps dumped... I have tried adding an input layer immediately before the masking layer:

model.add(Input(batch_size=None, shape=inp))

but that gave no solution, only more errors. How do I rewrite the model input layers to receive and train on the (96, 33) array? Or do I need to consolidate a batch of sequences (e.g. 4 sequences) into an array of shape [4, 96, 33] and then submit that to the model?


1 Answer


The only working solution I have come up with is to combine two or more (96, 33) arrays into a minibatch:

import numpy as np

minibatchSize = 2
# array1 and array2 are two padded state arrays, each of shape (96, 33)
current_states = np.concatenate([array1, array2]).reshape(
    minibatchSize, NUM_TIMEPERIODS, NUM_FEATURES)

and then submit that to the model, which has an added Input layer just before Masking:

model.add(Input(batch_size=minibatchSize, shape=inp))
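Prediction on the stacked batch then returns one row of Q-values per sequence (a sketch, assuming the model defined above):

current_qs_list = model.predict(current_states)
# current_qs_list.shape == (minibatchSize, 4): one softmax row per sequence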

I just cannot get it to work without specifying a batch size greater than 1 and declaring that batch size in an Input layer. It seems I must use batches...
