Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to learn an action-value function giving the expected utility (cumulative reinforcement) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well, or how badly, the agent is performing). During the learning process, the agent must balance exploitation (acting greedily with respect to the current action-value function) against exploration (acting randomly to discover new states or actions better than those currently estimated). A common, simple way to handle this trade-off is an epsilon-greedy policy, illustrated below.
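As a concrete illustration of the epsilon-greedy policy just mentioned, a minimal sketch in Python (the dictionary-based Q-table and the epsilon value of 0.1 are assumptions for illustration, not taken from any question below):

    import random

    def epsilon_greedy(Q, state, actions, epsilon=0.1):
        # with probability epsilon act randomly (explore) ...
        if random.random() < epsilon:
            return random.choice(actions)
        # ... otherwise act greedily w.r.t. current Q estimates (exploit)
        return max(actions, key=lambda a: Q.get((state, a), 0.0))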

447 questions
10
votes
2 answers

Q-learning in game not working as expected

I have attempted to implement Q-learning into a simple game I have written. The game is based around the player having to "jump" to avoid oncoming boxes. I have designed the system with two actions, jump and do_nothing, and the states are the…
Jack Wilsdon • 6,706
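A minimal sketch of how such a two-action Q-table and its update could look (the state encoding, e.g. the distance to the next box, is a hypothetical placeholder, since the question's state definition is truncated above):

    from collections import defaultdict

    ACTIONS = ["jump", "do_nothing"]
    Q = defaultdict(float)   # Q[(state, action)] defaults to 0.0 for unseen pairs

    def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
        # standard one-step Q-learning backup over the two actions
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])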
9
votes
2 answers

Q Learning Algorithm for Tic Tac Toe

I could not understand how to update Q values for the tic-tac-toe game. I have read all about it, but I could not picture how to do it. I read that the Q value is updated at the end of the game, but I don't understand how that works if there is a Q value for each action?
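What "updated at the end of the game" usually means here is a Monte-Carlo-style backup: record every (state, action) pair played, then, once the terminal reward is known, walk the episode backwards so earlier moves receive a discounted share of it. A sketch, assuming Q is a dict-like table:

    def update_episode(Q, episode, final_reward, alpha=0.1, gamma=0.9):
        # episode: list of (state, action) pairs in the order they were played
        G = final_reward                 # return observed at the end of the game
        for state, action in reversed(episode):
            Q[(state, action)] += alpha * (G - Q[(state, action)])
            G *= gamma                   # discount as we move toward the opening move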
8
votes
2 answers

Q Learning Applied To a Two Player Game

I am trying to implement a Q Learning agent to learn an optimal policy for playing against a random agent in a game of Tic Tac Toe. I have created a plan that I believe will work. There is just one part that I cannot get my head around. And this…
Frederick • 115
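The step that is usually hardest to get one's head around is where the opponent's move belongs. One workable answer is to fold the random opponent into the environment, so the "next state" seen by the Q-update is the board after the opponent has replied. A self-contained sketch (the 9-character string board and the +1/0/-1 rewards are assumptions):

    import random

    LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] != " " and board[a] == board[b] == board[c]:
                return board[a]
        return None

    def legal_moves(board):
        return [i for i, cell in enumerate(board) if cell == " "]

    def env_step(board, action):
        # Agent ("X") moves, then a random opponent ("O") replies; both
        # half-moves together form one environment transition for the Q-update.
        board = board[:action] + "X" + board[action + 1:]
        if winner(board) == "X":
            return board, 1.0, True
        if not legal_moves(board):
            return board, 0.0, True      # draw
        reply = random.choice(legal_moves(board))
        board = board[:reply] + "O" + board[reply + 1:]
        if winner(board) == "O":
            return board, -1.0, True
        return board, 0.0, not legal_moves(board)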
8
votes
1 answer

Questions about Q-Learning using Neural Networks

I have implemented Q-Learning as described in http://web.cs.swarthmore.edu/~meeden/cs81/s12/papers/MarkStevePaper.pdf. In order to approximate Q(S,A) I use a neural network structure like the following: activation sigmoid; inputs: number of inputs + 1…
7
votes
2 answers

Criteria for convergence in Q-learning

I am experimenting with the Q-learning algorithm. I have read from different sources and understood the algorithm; however, there seems to be no clear, mathematically backed convergence criterion. Most sources recommend iterating several times…
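There is indeed no universally agreed stopping rule for tabular Q-learning; a common practical proxy is to stop when the largest change in any Q entry between sweeps falls below a tolerance. A sketch (the tolerance value is an assumption):

    def converged(Q_old, Q_new, tol=1e-4):
        # largest absolute change across all (state, action) entries
        keys = set(Q_old) | set(Q_new)
        delta = max((abs(Q_new.get(k, 0.0) - Q_old.get(k, 0.0)) for k in keys),
                    default=0.0)
        return delta < tol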
7
votes
0 answers

Why does my agent always take the same action in DQN - Reinforcement Learning

I have trained an RL agent using the DQN algorithm. After 20000 episodes my rewards have converged. Now when I test this agent, it always takes the same action, irrespective of the state. I find this very weird. Can someone help me with this? Is…
chink • 1,505
7
votes
2 answers

Deep Q Network is not learning

I tried to code a Deep Q Network to play Atari games using TensorFlow and OpenAI's Gym. Here's my code: import tensorflow as tf import gym import numpy as np import os env_name = 'Breakout-v0' env = gym.make(env_name) num_episodes = 100 input_data…
7
votes
2 answers

RL Activation Functions with Negative Rewards

I have a question regarding appropriate activation functions for environments that have both positive and negative rewards. In reinforcement learning, our output, I believe, should be the expected reward for all possible actions. Since some options…
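The short answer usually given: hidden layers may use any nonlinearity, but the output layer should be linear, because a sigmoid is confined to (0, 1) and can never emit a negative Q-value. A minimal Keras sketch of that shape (layer sizes are placeholders, not taken from the question):

    import tensorflow as tf

    def build_q_network(state_dim, num_actions):
        return tf.keras.Sequential([
            tf.keras.Input(shape=(state_dim,)),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(64, activation="relu"),
            # default (linear) activation: outputs are unbounded, so
            # negative expected rewards stay representable
            tf.keras.layers.Dense(num_actions),
        ])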
7
votes
1 answer

list index out of range error using random.choice

I'm getting the error below when I run my program, which has the function defined below in it. I think it's the valid_actions = filter(lambda x: x != random.choice(maxQactions)) part that's causing the error. Does anyone see what the issue is, or…
user3476463 • 3,967
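Two likely failure modes in that line: random.choice raises IndexError on an empty sequence, and in Python 3 filter returns a lazy iterator rather than a list. A defensive sketch (function and variable names are hypothetical, echoing the question):

    import random

    def pick_action(max_q_actions, all_actions):
        candidates = list(max_q_actions)     # filter() is lazy in Python 3
        # random.choice raises IndexError on an empty sequence,
        # so fall back to the full action set in that case
        return random.choice(candidates if candidates else list(all_actions))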
7
votes
1 answer

How to implement q-learning in R?

I am learning about q-learning and found a Wikipedia post and this website. According to the tutorials and pseudocode, I wrote this much in R: #q-learning…
Eka • 14,170
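For reference, the pseudocode in most tutorials boils down to the loop below, sketched here in Python to match the rest of this page; it translates almost line for line to R. The environment interface (reset, step, actions) is an assumption:

    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
        # env is assumed to expose reset() -> state, a list env.actions,
        # and step(state, action) -> (next_state, reward, done)
        Q = defaultdict(float)
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                if random.random() < epsilon:
                    action = random.choice(env.actions)
                else:
                    action = max(env.actions, key=lambda a: Q[(state, a)])
                next_state, reward, done = env.step(state, action)
                best_next = max(Q[(next_state, a)] for a in env.actions)
                Q[(state, action)] += alpha * (reward + gamma * best_next
                                               - Q[(state, action)])
                state = next_state
        return Q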
6
votes
1 answer

DQN Pytorch Loss keeps increasing

I am implementing a simple DQN algorithm using PyTorch, to solve the CartPole environment from gym. I have been debugging for a while now, and I can't figure out why the model is not learning. Observations: using SmoothL1Loss performs worse than…
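A frequent cause of a rising DQN loss is bootstrapping targets from the very network being optimized. The standard remedy, sketched below rather than taken from the asker's code, is a periodically synchronized target network (the 4-input/2-output sizes assume CartPole):

    import copy
    import torch

    policy_net = torch.nn.Sequential(
        torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
    target_net = copy.deepcopy(policy_net)   # frozen copy used only for targets
    target_net.eval()

    def td_targets(rewards, next_states, dones, gamma=0.99):
        # targets come from the target network and carry no gradient
        with torch.no_grad():
            next_q = target_net(next_states).max(dim=1).values
        return rewards + gamma * next_q * (1.0 - dones)

    # every N optimization steps, re-synchronize:
    # target_net.load_state_dict(policy_net.state_dict())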
6
votes
1 answer

Something wrong with Keras code Q-learning OpenAI gym FrozenLake

Maybe my question will seem stupid. I'm studying the Q-learning algorithm. In order to better understand it, I'm trying to remake the TensorFlow code of this FrozenLake example into Keras code. My code: import gym import numpy as np import…
6
votes
3 answers

Learning rate of a Q learning agent

The question is how the learning rate influences the convergence rate and convergence itself. If the learning rate is constant, will the Q function converge to the optimal one, or must the learning rate necessarily decay to guarantee convergence?
uduck • 149
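The classical result (Watkins and Dayan, 1992) requires the learning rate for each state-action pair to satisfy sum(alpha) = infinity and sum(alpha^2) < infinity; a constant rate keeps adapting but does not converge exactly, while a per-pair decay such as 1/(1 + visits) meets both conditions. A sketch:

    from collections import defaultdict

    visits = defaultdict(int)

    def learning_rate(state, action):
        # alpha_n = 1/n per (state, action) pair: the harmonic series
        # diverges while the series of its squares converges
        visits[(state, action)] += 1
        return 1.0 / visits[(state, action)]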
6
votes
1 answer

Implementing reinforcement learning in NetLogo (Learning in multi-agent models)

I am thinking of implementing a learning strategy for different types of agents in my model. To be honest, I still do not know what kind of questions I should ask first or where to start. I have two types of agents that I want to learn by…
6
votes
3 answers

Unbounded increase in Q-Value, consequence of recurrent reward after repeating the same action in Q-Learning

I'm in the process of developing a simple Q-Learning implementation for a trivial application, but there's something that keeps puzzling me. Let's consider the standard formulation of Q-Learning: Q(S, A) = Q(S, A) + alpha * [R + MaxQ(S', A') -…
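The usual diagnosis: as quoted, there is no discount factor gamma on the max term. With gamma < 1, even a reward repeated forever drives Q no higher than R / (1 - gamma), so the value cannot grow without bound. A quick numeric check of that fixed point (constants chosen for illustration):

    alpha, gamma, R = 0.1, 0.9, 1.0
    q = 0.0
    for _ in range(10000):
        # same state, same action, same reward, forever
        q += alpha * (R + gamma * q - q)
    print(q, R / (1 - gamma))   # both approach 10.0, the fixed point R / (1 - gamma)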