Questions tagged [dqn]
A DQN (Deep Q-Network) is a multi-layered neural network applied to Q-learning, with a target network and experience replay added for stability.
206 questions
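For readers new to the tag, a minimal sketch of the two ingredients named in the description above, written in PyTorch; the layer sizes, hyperparameters, and the assumption that the replay buffer holds (state, action, reward, next_state, done) float tensors are all illustrative, not taken from any question below.

import random
from collections import deque
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))       # online Q-network
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))  # periodically synced copy
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)    # experience replay: (s, a, r, s2, done) tensors

def train_step(batch_size=32, gamma=0.99):
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)   # Q(s, a) for the taken actions
    with torch.no_grad():                                         # bootstrap from the frozen copy
        target = r + gamma * target_net(s2).max(dim=1).values * (1 - done)
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()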
1
vote
1 answer
OpenAI Gym LunarLander execution considerably slowed down for an unknown reason
I have been playing around with the OpenAI Gym LunarLander environment, testing a DQN neural network. I had gotten to a point where it was slowly learning. Since I had started with the CartPole problem, which was solved in a couple of minutes/episodes, I…

Max Michel
- 575
- 5
- 20
1
vote
0 answers
Q-learning state-action pair: what is the 'state' exactly?
Hello all, I'm trying to write a deep Q-learning network. I am not using any sort of gym environment or anything, just a CNN using screen grabs.
Since I'm not using gym's nicely coded, user-friendly environments, what do I actually save for my…
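One common answer, borrowed from the Atari DQN setup rather than from this specific question: the "state" you store in the replay buffer is a preprocessed stack of the last few screen grabs, so the network can see motion. A rough sketch, where the 84x84 grayscale size and the 4-frame stack are conventional choices (and the capture is assumed to be larger than 84x84), not requirements:

from collections import deque
import numpy as np

FRAME_STACK = 4
frames = deque(maxlen=FRAME_STACK)

def preprocess(screen_grab):
    # crude RGB -> grayscale, downsample to roughly 84x84, scale to [0, 1]
    gray = screen_grab.mean(axis=2)
    return gray[::gray.shape[0] // 84, ::gray.shape[1] // 84][:84, :84] / 255.0

def get_state(screen_grab):
    frames.append(preprocess(screen_grab))
    while len(frames) < FRAME_STACK:      # pad at the start of an episode
        frames.append(frames[-1])
    return np.stack(frames, axis=0)       # shape (4, 84, 84), fed to the CNN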

oz.vegas
- 129
- 7
1
vote
0 answers
Why is there no convergence although you can see that the agent is learning?
Recently I have been tackling a problem with reinforcement learning (I'm very new to the field). The problem is simple: we have a matrix of size NxN with all elements equal to zero, and the goal is for the agent to change all values to 1. I can see that…

hosseinoj
- 23
- 2
1
vote
1 answer
How should I define the state for my gridworld-like environment?
The problem I want to solve is actually not this simple, but this is kind of a toy game to help me solve the greater problem.
So I have a 5x5 matrix with values all equal to 0:
structure = np.zeros(25).reshape(5, 5)
and the goal is for the agent…
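For the question above, a minimal sketch of one possible state encoding, under the assumption (not stated in the truncated question) that the agent also occupies a (row, col) cell on the grid: flatten the 0/1 grid and append a one-hot plane for the agent's position.

import numpy as np

structure = np.zeros(25).reshape(5, 5)

def encode_state(structure, agent_pos):
    # flatten the 0/1 grid and append a one-hot encoding of the agent's cell
    agent_plane = np.zeros_like(structure)
    agent_plane[agent_pos] = 1.0
    return np.concatenate([structure.ravel(), agent_plane.ravel()])  # length-50 vector

state = encode_state(structure, (2, 3))   # example: agent at row 2, col 3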

hosseinoj
- 23
- 2
1
vote
0 answers
Deep Value-only Reinforcement Learning: Train V(s) instead of Q(s,a)?
Is there a value-based (deep) reinforcement learning (RL) algorithm available that is centred fully around learning only the state-value function V(s), rather than the state-action-value function Q(s,a)?
If not, why not, or could it easily be…
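One way to see why pure V(s) methods are less common: to act greedily from V alone you need a model of the environment to look one step ahead, whereas Q(s,a) stores that lookahead directly. A sketch under that assumption, where model(state, action) -> (reward, next_state) and v(state) are hypothetical callables, not any library's API:

import numpy as np

def greedy_action(state, actions, model, v, gamma=0.99):
    # reconstruct Q(s, a) on the fly as r + gamma * V(s');
    # a model-free Q-network stores exactly this quantity without needing `model`
    returns = [r + gamma * v(s2) for r, s2 in (model(state, a) for a in actions)]
    return actions[int(np.argmax(returns))]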

FlorianH
- 600
- 7
- 18
1
vote
1 answer
Implementing Dueling DQN on TensorFlow 2.0
I'm trying to implement my own Dueling DQN using TensorFlow 2, based on https://arxiv.org/pdf/1511.06581.pdf. I'm training it on the Atlantis environment but I can't get good results (mean reward per game keeps decreasing while TD loss…
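For reference, a minimal sketch of the dueling head from that paper in tf.keras (the layer sizes and the 84x84x4 input are illustrative, not the asker's code): separate value and advantage streams, recombined with the mean-advantage term so that Q stays identifiable.

import tensorflow as tf
from tensorflow.keras import layers

def dueling_q_model(n_actions, input_shape=(84, 84, 4)):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 8, strides=4, activation="relu")(inputs)
    x = layers.Conv2D(64, 4, strides=2, activation="relu")(x)
    x = layers.Flatten()(x)
    v = layers.Dense(512, activation="relu")(x)
    v = layers.Dense(1)(v)                        # state value V(s)
    a = layers.Dense(512, activation="relu")(x)
    a = layers.Dense(n_actions)(a)                # advantages A(s, a)
    # Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a)), as in the paper's aggregation
    q = layers.Lambda(
        lambda t: t[0] + t[1] - tf.reduce_mean(t[1], axis=1, keepdims=True)
    )([v, a])
    return tf.keras.Model(inputs, q)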

jh1783
- 21
- 6
1
vote
1 answer
Double DQN doesn't make any sense
Why use 2 networks, train once every episode, and update the target network every N episodes, when we can use 1 network and train it ONCE every N episodes? There is literally no difference!
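The schedule isn't actually the difference. In Double DQN the online network chooses the argmax action and the target network evaluates it; that decoupling is what reduces overestimation bias, and a single network trained once every N episodes would still both select and evaluate with the same values. A hedged PyTorch-style sketch, where q_net and target_net are the usual two DQN networks and r, s2, done are batch tensors:

import torch

def double_dqn_target(r, s2, done, q_net, target_net, gamma=0.99):
    with torch.no_grad():
        best_a = q_net(s2).argmax(dim=1, keepdim=True)         # selection: online network
        q_next = target_net(s2).gather(1, best_a).squeeze(1)   # evaluation: target network
    return r + gamma * q_next * (1 - done)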
user12592480
1
vote
1 answer
Why does the Deep Q-Network algorithm perform only one gradient descent step?
Why does the DQN algorithm perform only one gradient descent step, i.e. train for only one epoch? Wouldn't it benefit from more epochs? Wouldn't its accuracy improve with more epochs?

Mika
- 231
- 1
- 10
1
vote
1 answer
Several dips in accumulated episodic rewards during training of a reinforcement learning agent
Hi, I am training reinforcement learning agents for a control problem using the PPO algorithm. I am tracking the accumulated reward for each episode during the training process. Several times during training I see a sudden dip in the…

chink
- 1,505
- 3
- 28
- 70
0
votes
0 answers
I don't understand how to get a dataset for more complex environments when trying to train a DQN, or how to find what hyperparameters/optimizers to use
In videos like this, the presenter talks about how you need to find the right fit for your data, which is shown on a scatter plot. I understand how this works when you have a dataset for something, but how does it work when you are trying to train a DQN to play…
0
votes
1 answer
Difficulty Implementing DQN for Gym's Taxi-v3 Problem
I've been working on solving the Gym Taxi-v3 problem using reinforcement learning algorithms. Initially, I applied tabular Q-learning, and after 10,000 training iterations the algorithm achieved a mean reward of 8.x with a 100% success rate, which was…
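One detail that often bites when moving from the tabular version to a DQN on Taxi-v3 (offered as a general hint, not as the asker's confirmed bug): the observation is a single integer in [0, 500), and a network learns far more easily from a one-hot encoding of that index than from the raw scalar. A sketch:

import numpy as np

N_STATES = 500   # size of the Taxi-v3 observation space

def one_hot(state_index):
    v = np.zeros(N_STATES, dtype=np.float32)
    v[state_index] = 1.0
    return v     # feed this 500-dim vector to the Q-network instead of the raw index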

Aaron
- 11
- 3
0
votes
0 answers
HER (Hindsight Experience Replay) with ACME DQN agent: running into issues
I'm doing some experiments with a project utilizing Acme with the TensorFlow version. We wanted to do some additional experiments utilizing HER (Hindsight Experience Replay).
I have been working on including that but am struggling to get it to work. I'm…
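Independent of Acme's replay plumbing (usually the hard part of this integration), the HER idea itself is small enough to sketch: after an episode, store a second copy of its transitions with the desired goal replaced by a goal that was actually achieved (the "final" strategy). The dictionary keys and the compute_reward callable below are illustrative, not Acme APIs.

def relabel_final(episode, compute_reward):
    # episode: list of dicts with keys obs, action, next_obs, goal, achieved_goal
    new_goal = episode[-1]["achieved_goal"]           # pretend the end state was the goal
    relabeled = []
    for tr in episode:
        r = compute_reward(tr["achieved_goal"], new_goal)
        relabeled.append({**tr, "goal": new_goal, "reward": r})
    return relabeled                                   # add these to the replay buffer as well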

Lucas Hendren
- 2,786
- 2
- 18
- 33
0
votes
0 answers
TypeError: 'NoneType' object is not iterable in batch = Transition(*zip(*transitions))
There is an error in the line batch = Transition(*zip(*transitions)).
TypeError: 'NoneType' object is not iterable
def optimize_model():
    if len(memory) < BATCH_SIZE:
        return
    transitions = memory.sample(BATCH_SIZE)
    # Transpose…
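That TypeError usually means transitions itself is None, i.e. memory.sample() returned nothing (for instance a sample() missing its return statement), rather than the stored tuples being malformed. For comparison, a buffer in the style of the official PyTorch DQN tutorial that does return a list:

import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ("state", "action", "next_state", "reward"))

class ReplayMemory:
    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, *args):
        self.memory.append(Transition(*args))

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)   # note the explicit return

    def __len__(self):
        return len(self.memory)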

Evdok
- 1
- 1
0
votes
0 answers
DQN in Stable Baselines 3 doesn't start training
I created a custom environment (I checked it with the check_env() function provided by Stable Baselines and it's fine). This is the code I use to start the training with DQN:
from stable_baselines3 import DQN
from…
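A minimal sketch of kicking off DQN training in Stable Baselines 3 on a custom environment (env below stands in for the asker's environment). One frequent cause of "it never seems to start training": DQN waits for learning_starts environment steps before doing any gradient updates, and that threshold can be large relative to a short run, so setting it explicitly makes the behaviour visible.

from stable_baselines3 import DQN

model = DQN(
    "MlpPolicy",
    env,                     # your custom environment that passed check_env()
    learning_starts=1_000,   # begin gradient updates after 1k collected steps
    verbose=1,
)
model.learn(total_timesteps=50_000, log_interval=10)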

Zackbord
- 13
- 5
0
votes
1 answer
How can I improve the metrics of my DQN agent in tensorflow?
I'm working on a deep reinforcement learning project with TensorFlow and I am struggling with the training of a DQN agent from the tf_agents module.
My project aims to simulate a fiscal society where there are three possible actions: pay taxes, pay more…
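For context, a hedged sketch of the knobs that most often move reward metrics when building a DqnAgent with tf_agents: the exploration rate and the target-network update period. train_env stands in for the asker's TF-Agents environment and the layer sizes are illustrative.

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.networks import q_network
from tf_agents.utils import common

q_net = q_network.QNetwork(
    train_env.observation_spec(),
    train_env.action_spec(),
    fc_layer_params=(100, 50),
)
agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    td_errors_loss_fn=common.element_wise_squared_loss,
    epsilon_greedy=0.1,        # exploration rate; often worth annealing
    target_update_period=200,  # how often the target network is refreshed
)
agent.initialize()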

Willy
- 1
- 1