Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to find an action-value function giving the expected utility (cumulative reinforcement) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function that tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) vs. exploration (acting randomly to discover new states or actions better than those currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
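The update rule and the epsilon-greedy trade-off described above can be sketched in a few lines of tabular Q-learning. This is a minimal illustration on a made-up 1-D corridor environment (states and rewards are assumptions for the sketch, not part of any question below):

```python
import random

# Toy environment: a corridor of 5 states; reaching state 4 ends the
# episode with reward 1, every other transition gives reward 0.
N_STATES = 5
ACTIONS = [-1, +1]                  # move left or move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    """Made-up dynamics for the sketch: clamp to the corridor ends."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = (next_state == N_STATES - 1)
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy: explore with probability EPSILON,
            # otherwise exploit the current action-value estimates
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # off-policy TD update: bootstrap from the greedy (max)
            # action in the next state, regardless of what was taken
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (
                reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# Greedy policy read off the learned table: +1 means "move right".
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)])
          for s in range(N_STATES - 1)}
```

Note the `max` over next-state actions in the update: that is what makes Q-learning off-policy, since it evaluates the greedy target even while the behaviour policy is epsilon-greedy.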

447 questions
0
votes
2 answers

Q Learning coefficients overflow

I've been using the blackbox challenge (www.blackboxchallenge.com) to try and learn some reinforcement learning. I've created a task and an environment for the challenge and I'm using PyBrain to train based on the black box environment. The summary…
0
votes
1 answer

Q-learning with linear function approximation

I would like to get some helpful instructions about how to use the Q-learning algorithm with function approximation. For the basic Q-learning algorithm I have found examples and I think I did understand it. In case of using function approximation I…
0
votes
0 answers

How do I apply Q-learning to a physical system?

We are two french mechanical engineering students interested in reinforcement learning trying to apply Q-learning to a rotary inverted pendulum for a project. We have watched David Silver's "youtube course" and read chapters of Sutton & Barto, the…
0
votes
1 answer

Training a pacman agent using any supervised learning algorithm

I created a simple game of Pacman (no power pills) and trained it using the Q-learning algorithm. Now I am thinking about training it using some supervised learning algorithm. I could create a dataset by collecting state information and then storing it…
kenway
  • 295
  • 2
  • 4
  • 10
0
votes
2 answers

Q learning transition matrix

I'm trying to figure out how to implement Q-learning in a gridworld example. I believe I understand the basics of how Q-learning works, but it doesn't seem to be giving me the correct values. This example is from Sutton and Barto's book on…
user3425451
  • 25
  • 1
  • 7
0
votes
1 answer

Java to Python Code Not Working

I am trying to convert the Java code to Python code, and I have done it so far. The Java code works but the Python code doesn't. Please help me. Python code: import random class QLearning(): alpha = 0.1 gamma = 0.9 state_a = 0 state_b…
ajknzhol
  • 6,322
  • 13
  • 45
  • 72
0
votes
1 answer

Estimate Q-Table online with a neural network

When I use a Q-table to store state-action values in reinforcement learning, some states never (or rarely) occur and their state-action values remain zero until the max iteration, so I decided to estimate the Q-table online with a neural network instead of using…
Ahmad R. Nazemi
  • 775
  • 10
  • 26
-1
votes
0 answers

Is there anything wrong with my update policy network function or my DQN in my deep-q-learning

I want to train deep q-learning to solve a Rubik's cube given 10 possible moves (I am implementing this for an Arduino project and don't have access to every side of the cube which is why I only allowed it 10 moves). I trained this model over the…
-1
votes
2 answers

Robot path planning using Deep Q network

How to create an OpenAI Gym custom environment?
zoraiz ali
  • 77
  • 5
-1
votes
1 answer

OSMNX: How to get immediate possible directions from a coordinate for a Q-learning algorithm

I'm working on a Q-learning algorithm that navigates over OSMNX nodes. My goal is to offer the Q-learning agent a step-based context where on each step I can list the possible actions, like: "straight, turn left, turn right...". So I would need a…
Macumbaomuerte
  • 2,197
  • 2
  • 19
  • 22
-1
votes
1 answer

Should DQN state values be 0 to 1 only?

Should the values of the state in DQN be only 0 to 1, for example state = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0], or can a state have values greater than 1, e.g. state = [6, 5, 4, 1, 1, 1, 2, 3, 15, 10]?
KKK
  • 507
  • 3
  • 12
-1
votes
1 answer

Managing time limit in Deep Q-learning

I'm trying to implement a Deep RL program in Python, where the agent has to solve the problem (approach a target) before the expiry of the time limit. What is the best way to manage the time? Is it a good idea to pass the remaining time as an input…
-1
votes
1 answer

C51 reinforcement learning algorithm extremely slow

I am applying reinforcement learning on a time series prediction problem. Until now, I have implemented a dueling DDQN algorithm with LSTM which seems to give some pretty good results, though sometimes slow to converge depending on the exact…
-1
votes
1 answer

How to model an article recommender as a Q-learning problem in Python

I want to implement an article recommender using Q-learning in Python. Our dataset has, for instance, four categories of articles, including health, sports, news, and lifestyle, and 10 articles for each category (40 articles in total). The idea is…
-1
votes
1 answer

Why does OpenAI Gym return reward zero for terminal states?

I've been experimenting with Gym (and RL) a lot lately, and there is one specific behaviour of Gym that has piqued my interest. Why does OpenAI Gym return reward 0 even when the game is over? E.g., in Breakout-v0, when all five lives are spent,…
Nilesh PS
  • 356
  • 3
  • 8