Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to learn an action-value function giving the expected utility (cumulative reinforcement) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) vs. exploration (acting randomly to discover new states or actions better than currently estimated). A common, simple way to handle this trade-off is an epsilon-greedy policy.
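For illustration, a minimal tabular sketch of the epsilon-greedy policy and the Q-learning update described above (the function names `epsilon_greedy` and `q_update` and the toy states are mine, not from any particular library):

```python
import random
from collections import defaultdict

def epsilon_greedy(q, state, actions, epsilon=0.1):
    """With probability epsilon pick a random action (exploration),
    otherwise pick the action with the highest current Q-value (exploitation)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

q = defaultdict(float)        # Q-table; unseen (state, action) pairs default to 0.0
actions = [0, 1]
q_update(q, "s0", 1, 1.0, "s1", actions)   # observed reward 1.0 for action 1 in s0
print(q[("s0", 1)])                        # 0.5 * (1.0 + 0.9 * 0.0 - 0.0) = 0.5
```

Because the update bootstraps from `max_a' Q(s', a')` rather than from the action the behavior policy actually takes next, the learned values target the greedy policy even while exploring, which is exactly what makes Q-learning off-policy.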

447 questions
3
votes
1 answer

Q-learning for a Ludo game?

I am at the moment trying to implement an AI player using Q-learning to play against 2 different random players. I am not sure Q-learning is applicable to a Ludo game, which is why I am a bit doubtful about it. I have for the game defined 11…
Lamda
  • 914
  • 3
  • 13
  • 39
3
votes
1 answer

Grid World representation for a neural network

I'm trying to come up with a better representation for the state of a 2-d grid world for a Q-learning algorithm which utilizes a neural network for the Q-function. In the tutorial, Q-learning with Neural Networks, the grid is represented as a 3-d…
Galen
  • 499
  • 5
  • 14
3
votes
1 answer

Adding constraints in Q-learning and assigning rewards if constraints are violated

I took an RL course recently and I am writing a Q-learning controller for a power management application where I have continuous states and discrete actions. I am using a neural network (Q-network) for approximating the action values and selecting…
3
votes
1 answer

Tensorflow implementation of loss of Q-network with slicing

I'm implementing a Q-network as described in Human-level control through deep reinforcement learning (Mnih et al. 2015) in TensorFlow. To approximate the Q-function they use a neural network. The Q-function maps a state and an action to a scalar…
3
votes
1 answer

Deep Neural Network combined with Q-learning

I'm using joint positions from a Kinect camera as my state space, but I think it's going to be too large (25 joints x 30 per second) to just feed into SARSA or Q-learning. Right now I'm using the Kinect Gesture Builder program which uses Supervised…
3
votes
1 answer

Difference between batch Q-learning and growing batch Q-learning

I am confused about the difference between batch and growing batch Q-learning. Also, if I only have historical data, can I implement growing batch Q-learning? Thank you!
ChiefsCreation
  • 389
  • 1
  • 3
  • 10
3
votes
4 answers

Q learning: Relearning after changing the environment

I have implemented Q-learning on a grid of size (n x n) with a single reward of 100 in the middle. The agent learns for 1000 epochs to reach the goal with the following policy: he chooses with probability 0.8 the move with the highest…
3
votes
1 answer

Is the Q-learning algorithm's implementation recursive?

I am trying to implement Q-learning. The general algorithm from here is as below. I just don't get whether I should implement the above statement of the original pseudo-code recursively for all next states which current…
dariush
  • 3,191
  • 3
  • 24
  • 43
3
votes
4 answers

Is Q-learning without a final state even possible?

I have to solve this problem with Q-learning. Well, actually I have to evaluate a Q-learning-based policy on it. I am a tourist manager. I have n hotels, each of which can hold a different number of persons. For each person I put in a hotel I get a…
3
votes
2 answers

Q-learning - Defining states and rewards

I need some help with solving a problem that uses the Q-learning algorithm. Problem description: I have a rocket simulator where the rocket is taking random paths and also crashes sometimes. The rocket has 3 different engines that can be either on…
mrjasmin
  • 1,230
  • 6
  • 21
  • 37
2
votes
1 answer

Can you limit the number of actions when using Q-learning?

I am currently implementing Q-learning to solve a maze which contains fires that start randomly. Would it be considered proper for me to code the action to not be an option for the agent if there is a fire in that direction, or should my reward…
2
votes
2 answers

Q-table representation for nested lists as states and tuples as actions

How can I create a Q-table, when my states are lists and actions are tuples? Example of states for N = 3 [[1], [2], [3]] [[1], [2, 3]] [[1], [3, 2]] [[2], [3, 1]] [[1, 2, 3]] Example of actions for those states [[1], [2], [3]] -> (1, 2), (1, 3),…
John Doe
  • 21
  • 2
2
votes
0 answers

Why is my DQN (Deep Q Network) not learning?

I am training a DQN (Deep Q Network) on a CartPole problem from OpenAI's gym, but when I start the training, the total score from an episode decreases, instead of increasing. I don't know if it is helpful but I noticed that the AI prefers one action…
2
votes
2 answers

How to Learn the Reward Function in a Markov Decision Process

What's the appropriate way to update your R(s) function during Q-learning? For example, say an agent visits state s1 five times, and receives rewards [0,0,1,1,0]. Should I calculate the mean reward, e.g. R(s1) = sum([0,0,1,1,0])/5? Or should I use a…
Cerin
  • 60,957
  • 96
  • 316
  • 522
2
votes
1 answer

Target values to train against in Deep Q Network

I understand the whole gist of Q-learning and its update equation: Q(s, a) = r + \gamma * max_a' Q(s', a'), where s is the current state, a is the action taken, r is the reward, s' is the next state as a result of the action, and we maximize…
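The target value quoted in that excerpt, r + \gamma * max_a' Q(s', a'), is typically computed for a whole batch of transitions when training a Deep Q Network. A minimal NumPy sketch under my own naming (`dqn_targets` and the toy arrays are illustrative, not from any library):

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute y = r + gamma * max_a' Q(s', a') for a batch of transitions,
    zeroing the bootstrap term at terminal states (done flag = 1)."""
    return rewards + gamma * next_q_values.max(axis=1) * (1.0 - dones)

rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0],    # Q-values of next state for transition 1
                   [1.0, 3.0]])   # Q-values of next state for transition 2
dones = np.array([0.0, 1.0])      # second transition ends the episode
print(dqn_targets(rewards, next_q, dones))   # [1 + 0.99*2.0, 0.0] = [2.98, 0.0]
```

The network is then regressed toward these fixed targets (they are treated as constants, with no gradient flowing through the max term).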