
I am building a DQN with LSTM layers. I am trying to pass 96-timeperiod, 33-feature arrays to the model for training, i.e. shape=(96, 33). I am also trying to implement a post-padding mask (mask_value=0.) to accommodate variable-length sequences (max length = 96).
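For reference, the padding step looks roughly like this (a sketch; raw_sequences is just an illustrative name for my list of (seq_len, 33) arrays):

from tensorflow.keras.preprocessing.sequence import pad_sequences

# raw_sequences: list of arrays, each of shape (seq_len, 33) with seq_len <= 96
padded = pad_sequences(raw_sequences, maxlen=96, dtype='float32',
                       padding='post', value=0.)
# padded.shape == (num_sequences, 96, 33); padded timesteps are all zeros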

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
inp = (NUM_TIMEPERIODS, NUM_FEATURES)  # (96, 33)
# Masking flags any timestep whose 33 features are all 0. so the LSTMs skip it
model.add(Masking(mask_value=0., input_shape=inp))
model.add(LSTM(NUM_FEATURES, activation='tanh', return_sequences=True))
model.add(LSTM(NUM_FEATURES, activation='tanh'))
model.add(Dense(NUM_FEATURES, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=Adam(lr=LEARNING_RATE, decay=DECAY),
              metrics=['accuracy'])

When I submit a sequence called current_states, with shape (96, 33), using:

current_qs_list = self.model.predict(current_states)

to generate Q-values, it returns the error:

ValueError: Input 0 of layer lstm is incompatible with the layer: 
expected ndim=3, found ndim=2. Full shape received: [32, 33]

I think the 32 is the masked length (out of the max length of 96) of this first sequence being submitted to the model, with the padded timesteps dumped... I have tried adding an input layer immediately before the masking layer:

model.add(Input(batch_size=None, shape=inp))

but that gave no solution, only more errors. How do I rewrite the model input layers to receive and train on the (96, 33) array? Or do I need to consolidate a batch of sequences (e.g. 4 sequences) into an array of shape [4, 96, 33] and then submit that to the model?


1 Answer


The only working solution I have come up with is to combine two or more (96, 33) arrays into a minibatch:

import numpy as np

minibatchSize = 2
# array1 and array2 are two padded state arrays, each of shape (96, 33)
current_states = np.concatenate([array1, array2]).reshape(
    minibatchSize, NUM_TIMEPERIODS, NUM_FEATURES)

and then submit that to the model, which has an added Input layer just before Masking:

model.add(Input(batch_size=minibatchSize, shape=inp))
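Prediction on the stacked batch then returns one row of Q-values per sequence (a sketch, assuming the model defined above):

current_qs_list = model.predict(current_states)
# current_qs_list.shape == (minibatchSize, 4): one softmax row per sequence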

I just cannot get it to work without specifying a batch size greater than 1 and declaring that batch size in an Input layer. It seems I must use batches...
