Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to learn an action-value function giving the expected utility (cumulative reward) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) against exploration (acting randomly to discover new states or actions better than currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
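The epsilon-greedy selection and the tabular Q-learning update described above can be sketched as follows (a minimal illustration; the table shape and the hyperparameter values are assumptions, not part of any particular question below):

```python
import numpy as np

def epsilon_greedy(Q, state, n_actions, epsilon, rng):
    """With probability epsilon explore (random action); otherwise exploit
    (act greedily with respect to the current action-value table Q)."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))   # explore
    return int(np.argmax(Q[state]))           # exploit

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

Note that the update bootstraps from max over the next state's action values regardless of which action the behaviour policy actually takes next; this is what makes Q-learning off-policy.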

447 questions
0
votes
1 answer

OpenAI gym breakout-ram-v4 unable to learn

I am using Q-learning and the program should be able to play the game after some tries, but it is not learning even when the epsilon value is 0.1. I have tried changing the batch size and the memory size. I have changed the code to give a -1 reward if the…
0
votes
1 answer

Weird results when playing with DQN with targets

I've been trying to implement DQN with a target network and I'm getting some really weird results. When I try to train my DQN from scratch on CartPole, it doesn't seem to learn, and the loss increases exponentially. However, if I load in a…
Alex
0
votes
1 answer

Bounding Box Refinement using Reinforcement Learning

I have a model which detects an object and draws a bounding box around it. The problem is that those bounding boxes are not accurate and need to be a little tighter on the object, rather than having some body parts exceed the box or some boxes bigger…
0
votes
1 answer

Performance Comparison between DoubleDQN & DQN

I tried the DoubleDQN and DQN algorithms on the gym NChain game and realized that the performance of DoubleDQN was not more stable or better than DQN. I set the batch size of the training after each action taken to be 1. May I know if this is the reason of…
CA Hau
0
votes
1 answer

Is maxQ' the sum of all possible rewards or the highest possible reward?

I'm coding a simple Q-learning example, and to update the Q-values you need maxQ'. I'm not sure whether maxQ' refers to the sum of all possible rewards or the highest possible reward:
user11105005
0
votes
1 answer

Is it possible to train a neural network with "split" output?

Is it possible to consider the output of one neural network as two or more sets of outputs? Let me explain a bit more (in a Q-learning context): imagine I have two agents in the same environment and each agent has a different amount of…
0
votes
0 answers

Beautify an image with reinforcement learning

I am trying to formulate and solve the following image-mutation problem. Suppose I am trying to insert an object image into a "background" image of several objects, and I will need to look for a "sweet spot" to insert it: I am tentatively…
0
votes
1 answer

How can I take actions and states when my transition between states depends on multiple actions simultaneously?

I have a model whose states depend on multiple actions; I can take a single parameter as an action, but what if the state transition depends on more than one action?
0
votes
1 answer

How to convert an R-table from (15, 15) to (255 states, 4 actions)

I'm setting up an R-table with (255 states, 4 actions). How do I build it from the (15, 15) R-table? I have created an R-table (15, 15), but it turns out I have to make an R-table (225, 4) for the homework. r_matrix = np.array([ [-1, -2, -3, -2, -3, -3, -4, -1,…
Try
0
votes
2 answers

Using a Q-learning model without external libraries

I am trying to use reinforcement learning on a Pacman-based game. I want to use Q-learning techniques to generate my agent's actions. I was planning on using the openai-gym and keras libraries to train my model, but I was hoping there was a way to save…
0
votes
1 answer

Display loss in a TensorFlow DQN without leaving tf.Session()

I have a DQN all set up and working, but I can't figure out how to display the loss without leaving the TensorFlow session. I first thought it involved creating a new function or class, but I'm not sure where to put it in the code, and what…
Rayna Levy
0
votes
1 answer

Teach a robot to collect items in a grid world before reaching the terminal state using reinforcement learning

My problem is the following. I have a simple grid world: https://i.stack.imgur.com/xrhJw.png The agent starts at the initial state labeled START, and the goal is to reach the terminal state labeled END. But the agent has to avoid the…
0
votes
1 answer

Loss decreased and jumped suddenly

I am training an agent with DQN. The reward is increasing and the loss is decreasing, which is a good sign that I am getting good results. However, I have a little doubt, because the loss decreased and then suddenly jumped to a very high value. Here are the first 20…
fgauth
0
votes
0 answers

How to obtain a single output from a CNN while feeding it multiple colour images?

I am doing a deep Q-learning task, and I have a sequence of 4 images that I have defined as a state. Now I want to feed these 4 images into a CNN and obtain a softmax over the outputs to decide what action to take. So how do I do this? Because 4…
0
votes
1 answer

What is the code for shooting bullets at dynamic objects in Python?

I want to train an AI using reinforcement learning in Python. The goal is that the AI should be able to shoot moving balls that come into the game environment randomly, at different speeds and from different positions. The AI (player) position is fixed and it can only…
Farbod.T