DQN (Deep Q-Network) is a multi-layered neural network that extends Q-learning with a target network and experience replay.
Questions tagged [dqn]
206 questions
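
A minimal sketch of the two mechanisms named in the tag description, experience replay and a target network, in PyTorch; the network sizes and hyperparameters below are illustrative assumptions, not taken from any question on this page.

import random
from collections import deque

import torch
import torch.nn as nn

def make_qnet():
    # Tiny illustrative network: 4 state features in, 2 action values out.
    return nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

qnet = make_qnet()                        # online network, updated every step
target = make_qnet()                      # target network, synced only occasionally
target.load_state_dict(qnet.state_dict())

replay = deque(maxlen=10_000)             # experience replay buffer of (s, a, r, s', done)
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma = 0.99

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)            # decorrelate transitions
    s, a, r, s2, done = (torch.tensor(x, dtype=torch.float32) for x in zip(*batch))
    q = qnet(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                                 # bootstrap from the frozen target
        y = r + gamma * target(s2).max(1).values * (1 - done)
    loss = nn.functional.mse_loss(q, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Every few hundred environment steps:
#     target.load_state_dict(qnet.state_dict())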
0
votes
1 answer
Training Snake to eat food in specific number of steps, using Reinforcement learning
I am trying my hand at reinforcement/deep Q-learning these days, and I started with a basic game of Snake.
With the help of this article:…

Mohak Shukla
- 77
- 1
- 7
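
A common way to encode a fixed step budget like the one in this question is a small per-step penalty plus a timeout. A hypothetical sketch, with every reward constant assumed:

def snake_reward(ate_food: bool, died: bool, steps_since_food: int,
                 step_limit: int = 100) -> float:
    if died or steps_since_food >= step_limit:   # hit a wall/itself, or timed out
        return -10.0
    if ate_food:
        return 10.0
    return -0.1   # small per-step cost nudges the agent toward short paths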
0
votes
0 answers
Is there any method in reinforcement learning to select multiple simultaneous actions?
I'm working on a research project that involves the application of reinforcement learning to planning and decision-making problems. Typically, these problems involve picking (sampling) multiple actions within a state based on ranking [max_q to…

M.zubair Islam
- 1
- 6
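
Picking the top-k actions from a single Q-value vector, as the ranking step here suggests, can be sketched as follows (the helper name and k are assumptions):

import numpy as np

def top_k_actions(q_values: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k highest-valued actions, best first."""
    idx = np.argpartition(q_values, -k)[-k:]       # unordered top k in O(n)
    return idx[np.argsort(q_values[idx])[::-1]]    # order those k descending

q = np.array([0.1, 0.9, 0.4, 0.7])
print(top_k_actions(q, 2))   # -> [1 3]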
0
votes
1 answer
DQN unstable predictions
I implemented DQN from scratch in Java; everything is custom made. I made it play Snake and the results are really good. But I have a problem.
To make the network as stable as possible, I'm using replay memory and also a target network. The network is…

MrHolal
- 329
- 3
- 5
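
Beyond replay memory and a periodic target-network copy, a further stabilizer often used with DQN is a soft (Polyak) target update. A framework-neutral Python sketch, with tau an assumed hyperparameter:

def soft_update(target_weights, online_weights, tau=0.005):
    # Polyak averaging: blend a small fraction tau of the online weights
    # into the target every step, instead of a hard copy every N steps.
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_weights, online_weights)]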
0
votes
2 answers
TypeError: __init__() missing 1 required positional argument: 'units' when using the NoisyDense Class
I am trying to implement Noisy Nets in my model. I found code on GitHub that implements a NoisyDense class.
I used this class inside my model.
Here is the code:
class Agent:
    def __init__(self, state_size, strategy="t-dqn",…

Raj Shah
- 13
- 1
- 4
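
The error means the class was instantiated without its units argument. A simplified stand-in, assuming the GitHub class takes units the way Keras' Dense layer does:

class NoisyDense:
    def __init__(self, units, **kwargs):
        self.units = units              # output width must always be supplied

layer = NoisyDense(64)                  # OK: units passed
# layer = NoisyDense()                  # TypeError: __init__() missing 1 required
#                                       # positional argument: 'units'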
0
votes
1 answer
Why is the CNN convolution output size in the PyTorch DQN tutorial computed with `kernel_size - 1`?
Based on my understanding, CNN output size for 1D is
output_size = (input_size - kernel_size + 2*padding)//stride + 1
Refer to the PyTorch DQN tutorial. The tutorial uses zero padding, which is fine. However, it computes the output size as…

Shern
- 712
- 1
- 8
- 20
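
With zero padding the two formulas agree, since size - (kernel_size - 1) - 1 equals size - kernel_size; a quick check:

def out_size(size, kernel_size, stride=1, padding=0):
    # The formula quoted in the question:
    return (size - kernel_size + 2 * padding) // stride + 1

def out_size_tutorial(size, kernel_size, stride=1):
    # The tutorial's form; with padding = 0 it is the same formula, because
    # size - (kernel_size - 1) - 1 == size - kernel_size.
    return (size - (kernel_size - 1) - 1) // stride + 1

for size, k, s in [(84, 5, 2), (40, 3, 1), (20, 5, 2)]:
    assert out_size(size, k, s) == out_size_tutorial(size, k, s)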
0
votes
1 answer
Why does randomizing samples in reinforcement learning with a non-linear function approximator reduce variance?
I have read the DQN thesis.
While reading the DQN paper, I found that randomly selecting the samples to learn from reduces divergence in RL with a non-linear function approximator.
If so, why is learning in RL with a non-linear function…

강문주
- 3
- 1
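
The paper's argument is that consecutive transitions are strongly correlated, so learning from them in order produces correlated updates that can destabilize a non-linear approximator; sampling uniformly from a replay buffer breaks that correlation. A sketch of the sampling step:

import random
from collections import deque

replay = deque(maxlen=50_000)   # stores (s, a, r, s', done) transitions

def sample_batch(batch_size=32):
    # Uniform sampling mixes old and new experience, so each minibatch is
    # close to i.i.d. rather than one correlated slice of a trajectory.
    return random.sample(replay, batch_size)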
0
votes
0 answers
Neural Network output shape mismatch
So I'm building my first simple DQN neural network, but I'm really struggling with the output shape of my network.
I have an input with 139 features, making it input_shape=(None, 139), and a batch size of 64. I have 4 outputs for the last layer, as…

Capt_Bender
- 71
- 2
- 8
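
For the shapes given (139 features, 4 outputs, batch size 64), the training targets must have shape (batch, 4) to match the last layer. A Keras sketch under those assumptions:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(139,)),
    tf.keras.layers.Dense(4),                # one Q-value per action
])
model.compile(optimizer="adam", loss="mse")

x = np.zeros((64, 139), dtype=np.float32)    # states: (batch, features)
y = np.zeros((64, 4), dtype=np.float32)      # targets must be (batch, 4) to match
model.fit(x, y, verbose=0)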
0
votes
1 answer
Formulation of a reward structure
I am new to reinforcement learning and experimenting with training RL agents.
I have a question about reward formulation: from a given state, if an agent takes a good action I give a positive reward, and if the action is bad, I give a negative reward.…

chink
- 1,505
- 3
- 28
- 70
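
A minimal encoding of the scheme described, with hypothetical magnitudes (their scale matters in practice, since rewards set the scale of the Q-targets):

def reward(action_was_good: bool) -> float:
    # Hypothetical magnitudes: +1 for a good action, -1 for a bad one.
    return 1.0 if action_was_good else -1.0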
-1
votes
1 answer
Should DQN state values be 0 to 1 only?
Should the values of the state in DQN be only 0 to 1? For example,
state = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]
or can a state have values greater than 1, e.g.
state = [6, 5, 4, 1, 1, 1, 2, 3, 15, 10]

KKK
- 507
- 3
- 12
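
DQN does not require inputs in [0, 1], but scaling features to a common range usually helps optimization. A min-max sketch, assuming the feature bounds are known:

import numpy as np

def normalize(state, low, high):
    # Min-max scaling into [0, 1]; assumes each feature's bounds are known.
    state = np.asarray(state, dtype=np.float32)
    return (state - low) / (high - low)

print(normalize([6, 5, 4, 1, 1, 1, 2, 3, 15, 10], low=0.0, high=15.0))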
-2
votes
1 answer
About reward policy in a DQN model
I'm wondering about the reward policy in a DQN model. I'm learning how to use DQN to solve cases, so I'm applying DQN to a deterministic case for which I already know the answer.
I'm developing a DQN model that finds the optimal threshold to obtain…

Jesús Valencia
- 13
- 5