
I'm using a PyTorch DQN for reinforcement learning of card games. I use a convolutional layer to detect sequence- and suit-related patterns in a 4x13 "image" of the state of the deck of cards. I then flatten the output of this layer, flatten the original 4x13 state of the deck (which was the input to the conv layer), and concatenate the two flattened tensors into a single input to the remaining Linear layers of my network.
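For concreteness, here is a minimal sketch of the forward pass described above; the channel count, hidden size, and number of actions are assumptions, not values from my actual network:

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Sketch: conv over the 4x13 deck state, concatenated with the raw state."""

    def __init__(self, n_actions=52):  # n_actions is an assumed placeholder
        super().__init__()
        # 1 input channel (the 4x13 deck "image"), 8 output channels (assumed)
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        conv_features = 8 * 4 * 13   # flattened conv output
        raw_features = 4 * 13        # flattened original state
        self.fc = nn.Sequential(
            nn.Linear(conv_features + raw_features, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):  # x: (batch, 1, 4, 13)
        conv_out = torch.relu(self.conv(x)).flatten(start_dim=1)
        raw = x.flatten(start_dim=1)
        # combine both feature vectors along the feature dimension
        combined = torch.cat([conv_out, raw], dim=1)
        return self.fc(combined)
```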

My question is whether it is necessary to "unwind" this operation when my loss is propagated through the network with backward(). It seems possible that the gradient tensors which make up the loss somehow already know where they came from. If not, it seems I will need to intervene at the right moment in backward() (as I do in forward()) and disentangle things.

This is conceptually a layer with multiple inputs. I have seen some examples of these but I've not been able to find one that seems to address this particular issue.

The model trains without runtime errors, but I am concerned that the Conv layer is not getting the proper back-propagation.
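To illustrate the concern, here is a small self-contained check (with assumed layer sizes) that tests whether gradients actually reach the conv layer through the flatten/concatenate step:

```python
import torch
import torch.nn as nn

# Assumed stand-ins for the layers described above
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
fc = nn.Linear(8 * 4 * 13 + 4 * 13, 1)

x = torch.randn(2, 1, 4, 13)
# Same forward structure: conv -> flatten, concat with flattened input
out = fc(torch.cat([torch.relu(conv(x)).flatten(1), x.flatten(1)], dim=1))
out.sum().backward()

# If gradients flow through the concatenation, this is not None
print(conv.weight.grad is not None)  # → True
```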

black-ejs
