DQN agent with vector input and vector output

Question

I am a beginner in Reinforcement Learning and Deep Learning, and I want to build a neural network for a DQN agent (in Keras) that receives a vector of length 3 as input and outputs another vector of length 10.

The input vector has one element that is equal to 1 and the other elements are equal to 0. It can also be all zeros, but it cannot have more than one element with the value 1.

Example:

[0, 1, 0]

Or:

[0, 0, 0]

The output must be a vector with 10 elements, where one element is equal to 1 and all the other elements are equal to 0. Just like the input vector, it can also be all zeros, but it cannot have more than one element with the value 1.

Example:

[0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

Or:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

If the input vector has a '1' in it, then the output vector must have at most one column with the value of 1.

If the input vector is all zeros, then the output vector must also be all zeros.

I have tried to create a convolutional neural network, but the examples I've come across treat images (hence 2D matrices) as inputs and have one value as output and not a vector.

You have not really described a problem or made a question, so the issue is unclear to me. Can you clarify? — Dr. Snoopy, Nov 18 '20 at 16:18
I edited my post a little bit. I described the problem being the agent with a vector as input and another vector as output and the desired behavior. — Ness, Nov 18 '20 at 16:26

score 1 · Answer 1 · answered Nov 18 '20 at 23:45

DQN is built on the Markov Decision Process (MDP) framework, so you need to be clear about what the state, the actions, and the rewards are before you can define an agent.

To me, it seems that your input is a state in one-hot encoding. To choose an architecture for this problem you would need to provide more details. It could be an LSTM layer if the states form a time series, for example, or just a simple dense layer.
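As a rough illustration of the "simple dense layer" option, here is a minimal Keras sketch. It assumes the 3-element one-hot vector is the whole state and that each of the 10 output positions corresponds to one discrete action. The function name `build_q_network`, the hidden layer size, the optimizer, and the loss are illustrative choices, not something given in the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

STATE_SIZE = 3    # length of the input vector from the question
NUM_ACTIONS = 10  # length of the output vector from the question


def build_q_network(hidden_units=32):
    """Small fully connected network: state in, one value per action out.

    hidden_units=32 is an arbitrary assumption for this sketch.
    """
    inputs = keras.Input(shape=(STATE_SIZE,))
    x = layers.Dense(hidden_units, activation="relu")(inputs)
    x = layers.Dense(hidden_units, activation="relu")(x)
    # Linear activation: the outputs are unbounded real-valued estimates,
    # not probabilities and not a ready-made one-hot vector.
    q_values = layers.Dense(NUM_ACTIONS, activation="linear")(x)
    model = keras.Model(inputs, q_values)
    # Adam + MSE are assumed defaults here; other losses are also used.
    model.compile(optimizer="adam", loss="mse")
    return model


q_network = build_q_network()

# A batch containing the single example state [0, 1, 0].
state = np.array([[0, 1, 0]], dtype="float32")
q = q_network.predict(state, verbose=0)  # array of shape (1, 10)
```

No convolutional layers are needed for an input this small; convolutions are mainly useful when the input has spatial structure, such as images.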

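In a DQN, the network output is not the action itself but a continuous Q-value, which represents how good it is to be in some state and perform an action a. In the usual implementation the network produces one Q-value per possible action, and the agent picks the action with the highest one. To me, what you call the output is in fact your set of actions.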
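To connect Q-values to the one-hot output vector described in the question, the following NumPy sketch turns a vector of 10 Q-values into an action vector. It assumes the all-zeros input rule is enforced outside the network, and the helper name `select_action_vector` and the epsilon-greedy exploration parameter are hypothetical additions for illustration. An alternative would be to add an explicit eleventh "do nothing" action and let the agent learn when to use it.

```python
import numpy as np

NUM_ACTIONS = 10  # length of the output vector from the question


def select_action_vector(state, q_values, epsilon=0.0, rng=None):
    """Convert Q-values into a one-hot action vector (hypothetical helper).

    state:    the 3-element input vector
    q_values: one Q-value estimate per action, length NUM_ACTIONS
    epsilon:  probability of picking a random action (exploration)
    """
    state = np.asarray(state)
    q_values = np.asarray(q_values)
    action_vector = np.zeros(NUM_ACTIONS, dtype=int)

    # Rule from the question: an all-zeros input gives an all-zeros output.
    if not state.any():
        return action_vector

    if rng is None:
        rng = np.random.default_rng()

    if rng.random() < epsilon:
        # Explore: choose an action uniformly at random.
        action = int(rng.integers(NUM_ACTIONS))
    else:
        # Exploit: choose the action with the highest Q-value.
        action = int(np.argmax(q_values))

    action_vector[action] = 1
    return action_vector


# Made-up Q-values, only to show the conversion.
example_q = np.array([0.1, -0.3, 0.9, 0.0, 0.2, -1.0, 0.4, 0.3, 0.5, 0.6])

chosen = select_action_vector([0, 1, 0], example_q)  # 1 at index 2
idle = select_action_vector([0, 0, 0], example_q)    # all zeros
```

With `epsilon=0.0` the choice is purely greedy; during training a positive epsilon is commonly used so the agent also tries actions whose Q-values are currently low.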

I strongly recommend following this material here to understand each component of a Markov Decision Process before diving into the DQN approach.
