Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to find an action-value function that gives the expected utility (reinforcement) of taking a given action in a given state and following a fixed policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily in terms of the current action-value function) against exploration (acting randomly to discover new states or actions better than those currently estimated). A common simple way of handling this trade-off is an epsilon-greedy policy, as sketched below.
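For illustration, here is a minimal epsilon-greedy action selection in Python; the one-state Q-table, the four-action set, and epsilon = 0.1 are assumed values, not taken from any particular question below:

    import random

    def epsilon_greedy(Q, state, actions, epsilon=0.1):
        # Explore with probability epsilon, otherwise act greedily on Q.
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])

    Q = {(0, a): 0.0 for a in range(4)}   # hypothetical 1-state, 4-action table
    print(epsilon_greedy(Q, state=0, actions=[0, 1, 2, 3]))

As epsilon shrinks toward 0 over training, the agent gradually shifts from exploration to exploitation.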

447 questions
1
vote
1 answer

Q-Learning optimisation with overlapping states

I am implementing Q-learning for a simple task, which involves a robot moving to a target position, in a continuous coordinate system. Each episode has a fixed length, and the rewards are sparse: there is a single reward given to the final…
Karnivaurus
1
vote
2 answers

Reward function for learning to play Curve Fever game with DQN

I've made a simple version of Curve Fever, also known as "Achtung Die Kurve". I want the machine to figure out how to play the game optimally. I copied and slightly modified an existing DQN from some Atari game examples, built with Google's…
1
vote
1 answer

Different rewards for same state in reinforcement learning

I want to implement Q-learning for the Chrome dinosaur game (the one you can play when you are offline). I defined my state as: distance to the next obstacle, speed, and the size of the next obstacle. For the reward I wanted to use the number of…
1
vote
0 answers

Q-Values in DQN are getting too big

I have already checked this question and confirmed this is not a duplicate issue. Problem: I have implemented an agent that uses a DQN with TensorFlow to learn the optimal policy of a game called 'dots and boxes'. The algorithm appears to actually…
1
vote
0 answers

How should I choose Keras parameters for grid exploration?

I am trying to train a neural network to efficiently explore a grid to locate an object using Keras and Keras-RL. Every "step", the agent chooses a direction to explore by choosing a number from 0 to 8, where each corresponds to a cardinal or…
Harrison Grodin
1
vote
1 answer

ϵ-greedy policy with decreasing rate of exploration

I want to implement an ϵ-greedy action-selection policy in Q-learning. Many people here have used the following equation for a decreasing rate of exploration: ɛ = e^(-En), where n is the age of the agent and E is the exploitation parameter. But I am not clear what…
D_Wills
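A small sketch of that decay schedule in Python; the value of E and the sampled ages are assumptions chosen for illustration:

    import math

    def decayed_epsilon(n, E=0.01):
        # eps = e^(-E*n): equals 1.0 at n = 0 and decays toward 0 with age.
        return math.exp(-E * n)

    for n in (0, 10, 100, 1000):
        print(n, round(decayed_epsilon(n), 4))   # 1.0, 0.9048, 0.3679, 0.0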
1
vote
1 answer

Sequence with the max score?

Let's say I have n states S = {s1, s2, s3, ..., sn} and a score for every transition, i.e. a T-matrix, e.g. s1->s5 = 0.3, s4->s3 = 0.7, etc. What algorithm or procedure should I use to select the best-scored sequence/path starting from state x…
sten
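To make the setup concrete, here is a brute-force sketch over an assumed toy 3-state score matrix; this is a baseline framing, not necessarily the right algorithm for large n (a dynamic-programming pass in the style of Viterbi scales better):

    # T[i][j] = assumed toy score for the transition from state i to state j
    T = [[0.0, 0.3, 0.7],
         [0.5, 0.0, 0.2],
         [0.1, 0.6, 0.0]]

    def best_path(T, start, steps):
        # Brute force: try every path of `steps` transitions, keep the best total.
        best = (float("-inf"), None)
        def walk(state, seq, score):
            nonlocal best
            if len(seq) == steps + 1:
                best = max(best, (score, seq))
                return
            for nxt in range(len(T)):
                walk(nxt, seq + [nxt], score + T[state][nxt])
        walk(start, [start], 0.0)
        return best

    print(best_path(T, start=0, steps=2))   # (1.3, [0, 2, 1])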
1
vote
2 answers

Why doesn't my neural network Q-learner learn tic-tac-toe?

Okay, so I have created a neural network Q-learner using the same idea as DeepMind's Atari algorithm (except I feed it raw data, not pictures (yet)). Neural network build: 9 inputs (0 for an empty spot, 1 for "X", -1 for "O"), 1 hidden layer with 9-50…
1
vote
1 answer

Pybrain reinforcement learning; dimension of state

I am working on a project to combine reinforcement learning with traffic light simulations using the package Pybrain. I have read the tutorial and implemented my own subclasses of Environment and Task. I am using an ActionValueNetwork as controller…
1
vote
1 answer

Why does Q-learning work in an unknown environment?

Q-learning uses an instant reward matrix R to model an environment. That means it uses a known matrix R for learning, so why do people say "Q-learning can work in an unknown environment"?
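For context, the standard tabular update only ever uses the single reward sampled from the environment at each step; the full matrix R never has to be known in advance. A minimal sketch (the 2x2 table, alpha, and gamma are assumed values):

    def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
        # Only the observed reward r enters the update, not all of R.
        target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    Q = {(s, a): 0.0 for s in range(2) for a in range(2)}
    q_update(Q, s=0, a=1, r=1.0, s_next=1, actions=[0, 1])
    print(Q[(0, 1)])   # 0.1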
1
vote
1 answer

Programmatically find next state for max(Q(s',a')) in Q-learning using R

I am writing a simple grid-world Q-learning program using R. This is my grid world. This simple grid world has 6 states, of which state 1 and state 6 are the starting and ending states. I avoided adding a fire pit, walls, or wind so as to keep my grid world as…
Eka
1
vote
1 answer

Can the Q-learning algorithm become overtrained?

It has been proved that the Q-learning algorithm converges to the Q-values of the optimal policy, which are unique. So is it correct to conclude that the Q-learning algorithm cannot become overtrained?
1
vote
1 answer

Q-learning with function approximation where each state doesn't have the same set of actions

I am applying Q-learning with function approximation to a problem where each state doesn't have the same set of actions. When I am calculating the target, Target = R(s,a,s') + γ max_a' Q(s',a'), since each state does not have the same set of actions…
Prabir
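A common way to handle this, sketched here under assumed names (the linear q_value, the feature map phi, and valid_actions are hypothetical), is to take the max only over the actions actually legal in s':

    def q_value(w, phi, s, a):
        # Hypothetical linear approximation: Q(s, a) = w . phi(s, a)
        return sum(wi * xi for wi, xi in zip(w, phi(s, a)))

    def td_target(r, s_next, valid_actions, w, phi, gamma=0.9):
        # Max only over the actions available in s'; with no legal
        # actions, treat s' as terminal and do not bootstrap.
        legal = valid_actions(s_next)
        if not legal:
            return r
        return r + gamma * max(q_value(w, phi, s_next, a) for a in legal)

    w = [0.5, -0.2]
    phi = lambda s, a: [s, a]                        # toy features
    valid_actions = lambda s: [0, 1] if s < 3 else []
    print(td_target(1.0, 2, valid_actions, w, phi))  # 1.9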
1
vote
0 answers

How can I choose the features for my Q-learning with linear function approximation?

I am developing an AI using reinforcement learning. It is a game in which the player should avoid bricks falling from the sky. There are 20 bricks falling to the ground. game screen shot, game play video link. I implemented the AI using reinforcement learning with…
1
vote
1 answer

How do you normalize weights in Q-learning with linear function approximation?

I am developing a simple game program to show Q-learning with linear function approximation. screen shot In this game, there are countless states. I have to consider many factors like the player's position, speed, and the enemy's position (there are 12 ~ 15…
Juho Sung
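One common approach (an assumption here, not necessarily the asker's setup) is to normalize each raw feature into [0, 1] before the linear combination, so that no single factor dominates the weight updates:

    def normalize(value, lo, hi):
        # Scale a raw value into [0, 1].
        return (value - lo) / (hi - lo)

    def features(player_x, speed, enemy_x, screen_w=800, max_speed=10.0):
        # Hypothetical feature vector for a game like the one described.
        return [normalize(player_x, 0, screen_w),
                normalize(speed, 0, max_speed),
                normalize(enemy_x, 0, screen_w)]

    print(features(player_x=400, speed=5.0, enemy_x=600))   # [0.5, 0.5, 0.75]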