When using deep Q-learning I am trying to capture motion by passing a number of grayscale frames as the input, each with dimensions 90x90. Four 90x90 frames are passed in to allow the network to detect motion. The four frames should be treated as a single state rather than a batch of 4 states. How can I get a vector of actions as the result instead of a matrix?
I am using PyTorch, and the network currently returns a 4x7 matrix: a row of action values for each frame. Here is the network:
self.conv1 = Conv2d(self.channels, 32, 8)
self.conv2 = Conv2d(32, 64, 4)
self.conv3 = Conv2d(64, 128, 3)
self.fc1 = Linear(128 * 52 * 52, 64)
self.fc2 = Linear(64, 32)
self.output = Linear(32, action_space)
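
For concreteness, here is a minimal, runnable sketch of what I mean (not my exact code: the class wrapper, the forward pass with ReLU activations and a flatten, and the fc1 input size recomputed for a 90x90 input with these kernel sizes, 90 -> 83 -> 80 -> 78, are filled in just for illustration):

    import torch
    import torch.nn.functional as F
    from torch.nn import Conv2d, Linear, Module

    class QNetwork(Module):
        def __init__(self, channels, action_space):
            super().__init__()
            self.conv1 = Conv2d(channels, 32, 8)   # 90x90 -> 83x83
            self.conv2 = Conv2d(32, 64, 4)         # 83x83 -> 80x80
            self.conv3 = Conv2d(64, 128, 3)        # 80x80 -> 78x78
            self.fc1 = Linear(128 * 78 * 78, 64)   # flattened conv output for a 90x90 input
            self.fc2 = Linear(64, 32)
            self.output = Linear(32, action_space)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            x = F.relu(self.conv2(x))
            x = F.relu(self.conv3(x))
            x = x.flatten(start_dim=1)             # keep the first (batch) dimension
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            return self.output(x)

    net = QNetwork(channels=1, action_space=7)

    # Four grayscale frames passed as a batch of 4 single-channel images:
    frames = torch.randn(4, 1, 90, 90)
    print(net(frames).shape)   # torch.Size([4, 7]) -- a row of action values per frame

What I want instead is for the 4 frames to be treated as one state, so the network produces a single row of 7 action values for that state.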