Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to find an action-value function giving the expected utility (cumulative reinforcement) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function that tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) vs. exploration (acting randomly to discover new states or actions better than those currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
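The update rule and the epsilon-greedy trade-off described above can be sketched in a few lines of tabular Q-learning. This is a minimal illustration on a made-up 1-D corridor environment (states and rewards are assumptions for the sketch, not part of any question below):

```python
import random

# Toy environment: a corridor of 5 states; reaching state 4 ends the
# episode with reward 1, every other transition gives reward 0.
N_STATES = 5
ACTIONS = [-1, +1]                  # move left or move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    """Made-up dynamics for the sketch: clamp to the corridor ends."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = (next_state == N_STATES - 1)
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy: explore with probability EPSILON,
            # otherwise exploit the current action-value estimates
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # off-policy TD update: bootstrap from the greedy (max)
            # action in the next state, regardless of what was taken
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (
                reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# Greedy policy read off the learned table: +1 means "move right".
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)])
          for s in range(N_STATES - 1)}
```

Note the `max` over next-state actions in the update: that is what makes Q-learning off-policy, since it evaluates the greedy target even while the behaviour policy is epsilon-greedy.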

447 questions
0
votes
2 answers

Q Learning coefficients overflow

I've been using the blackbox challenge (www.blackboxchallenge.com) to try and learn some reinforcement learning. I've created a task and an environment for the challenge and I'm using PyBrain to train based on the black box environment. The summary…
0
votes
1 answer

Q-learning with linear function approximation

I would like to get some helpful instructions about how to use the Q-learning algorithm with function approximation. For the basic Q-learning algorithm I have found examples and I think I did understand it. In case of using function approximation I…
0
votes
0 answers

How do I apply Q-learning to a physical system?

We are two french mechanical engineering students interested in reinforcement learning trying to apply Q-learning to a rotary inverted pendulum for a project. We have watched David Silver's "youtube course" and read chapters of Sutton & Barto, the…
0
votes
1 answer

Training a pacman agent using any supervised learning algorithm

I created a simple game of Pacman (no power pills) and trained it using the Q-learning algorithm. Now I am thinking about training it using some supervised learning algorithm. I could create a dataset by collecting state information and then storing it…
kenway
  • 295
  • 2
  • 4
  • 10
0
votes
2 answers

Q learning transition matrix

I'm trying to figure out how to implement Q-learning in a gridworld example. I believe I understand the basics of how Q-learning works, but it doesn't seem to be giving me the correct values. This example is from Sutton and Barto's book on…
user3425451
  • 25
  • 1
  • 7
0
votes
1 answer

Java to Python Code Not Working

I am trying to convert the Java code to Python code, and I have done it so far. The Java code works but the Python code doesn't. Please help me. Python code: import random class QLearning(): alpha = 0.1 gamma = 0.9 state_a = 0 state_b…
ajknzhol
  • 6,322
  • 13
  • 45
  • 72
0
votes
1 answer

Estimate Q-Table online with a neural network

When I use a Q-table to store state-action values in reinforcement learning, some states never (or rarely) occur and their state-action values remain zero until the max iteration, so I decided to estimate the Q-table online with a neural network instead of using…
Ahmad R. Nazemi
  • 775
  • 10
  • 26
-1
votes
0 answers

Is there anything wrong with my update policy network function or my DQN in my deep-q-learning

I want to train deep q-learning to solve a Rubik's cube given 10 possible moves (I am implementing this for an Arduino project and don't have access to every side of the cube which is why I only allowed it 10 moves). I trained this model over the…
-1
votes
2 answers

Robot path planning using Deep Q network

How to create an OpenAI Gym custom environment?
zoraiz ali
  • 77
  • 5
-1
votes
1 answer

OSMNX: How to get immediate possible directions from a coordinate for a Q-learning algorithm

I'm working on a Q-learning algorithm that navigates over OSMNX nodes. My goal is to offer the Q-learning agent a step-based context where on each step I can list the possible actions, like: "straight, turn left, turn right...". So I would need a…
Macumbaomuerte
  • 2,197
  • 2
  • 19
  • 22
-1
votes
1 answer

Should DQN state values be 0 to 1 only?

Should the values of the state in DQN be only 0 to 1, for example state = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0], or can a state have values greater than 1, e.g. state = [6, 5, 4, 1, 1, 1, 2, 3, 15, 10]?
KKK
  • 507
  • 3
  • 12
-1
votes
1 answer

Managing time limit in Deep Q-learning

I'm trying to implement a Deep RL program in Python, where the agent has to solve the problem (approach a target) before the expiry of the time limit. What is the best way to manage the time? Is it a good idea to pass the remaining time as an input…
-1
votes
1 answer

C51 reinforcement learning algorithm extremely slow

I am applying reinforcement learning on a time series prediction problem. Until now, I have implemented a dueling DDQN algorithm with LSTM which seems to give some pretty good results, though sometimes slow to converge depending on the exact…
-1
votes
1 answer

How to model an article recommender as a Q-learning problem in Python

I want to implement an article recommender using Q-learning in Python. Our dataset has, for instance, four categories of articles, including health, sports, news, and lifestyle, and 10 articles for each category (40 articles in total). The idea is…
-1
votes
1 answer

Why does OpenAI Gym return reward zero for terminal states?

I've been experimenting with Gym (and RL) a lot lately, and there is one specific behaviour of Gym that has piqued my interest. Why does OpenAI Gym return reward 0 even when the game is over? E.g., in Breakout-v0, when all five lives are spent,…
Nilesh PS
  • 356
  • 3
  • 8