Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to find an action-value function that gives the expected utility (return) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) versus exploration (acting randomly to discover new states or better actions than currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
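The epsilon-greedy trade-off described above can be sketched in a few lines. This is a minimal illustration, not a full training loop; the state/action counts and hyperparameters are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 16, 4          # illustrative sizes
q_table = np.zeros((n_states, n_actions))

def epsilon_greedy(state, epsilon):
    """With probability epsilon act randomly (explore), else greedily (exploit)."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(q_table[state]))

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (target - q_table[state, action])
```

Note the `max` over next-state actions in `q_update`: the update bootstraps from the greedy action regardless of the action the behaviour policy actually takes next, which is what makes Q-learning off-policy.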

447 questions
0
votes
0 answers

How to take off objects from map when using Q-learning with OpenAI-Gym in Python

I'm trying to learn how to use Q-learning with OpenAI Gym in Python, and I modified the existing gym 'FrozenLake-v0' to make an example where the agent goes through a labyrinth map and picks up apples - there is a reward for every picked apple.…
0
votes
3 answers

What does "IndexError: index 20 is out of bounds for axis 1 with size 20" mean?

I was working on Q-learning in a maze environment. It was working fine at the initial stage, but afterward I got the following: max_future_q = np.max(q_table[new_discrete_state]) IndexError: index 20 is out of bounds for axis 1 with…
Sherin shibu
  • 5
  • 1
  • 2
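An IndexError like the one above commonly appears when a continuous observation is discretized and the topmost value maps to an index equal to the table size. A minimal sketch of a bounds-safe discretizer (the observation range and bucket count here are illustrative assumptions, not from the question):

```python
n_buckets = 20
low, high = -1.0, 1.0  # assumed observation range for illustration

def discretize(obs):
    """Map a continuous observation into a bucket index 0..n_buckets-1.

    Without the clamp, obs == high maps to index n_buckets, which is
    exactly an "index 20 is out of bounds for axis ... with size 20" error.
    """
    idx = int((obs - low) / (high - low) * n_buckets)
    return min(max(idx, 0), n_buckets - 1)
```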
0
votes
1 answer

Are Q-learning agents required to converge towards actual state-action values?

It is my understanding that Q-learning attempts to find the actual state-action values for all states and actions. However, my hypothetical example below seems to indicate that this is not necessarily the case. Imagine a Markov decision process…
DarkZero
  • 51
  • 5
0
votes
1 answer

IronPython not returning dictionary keys as expected

I am trying to create a Q-table as a dictionary filled with random values in Grasshopper (a parametric design tool that uses IronPython as its interpreter). When I enter the code as shown in image 1, I receive a dictionary as shown in image 2. Keys are…
0
votes
1 answer

Why does the Pacman game pause automatically for a few seconds and then resume?

I was trying the Pacman game with Q-learning (reinforcement learning) in Java. However, I could see the game pausing automatically for a few seconds and then resuming. I just want to know the reason for this. YouTube video…
iamarkaj
  • 1
  • 1
0
votes
1 answer

Bellman equation

In the Bellman equation, where s = a particular state (room), a = action (moving between the rooms), s′ = the state to which the robot goes from s, γ = the discount factor, R(s, a) = a reward function which takes a state s and an action a and outputs a reward…
TinyCoder
  • 33
  • 10
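The symbols listed in the excerpt above (s, a, s′, a discount factor, R(s, a)) are those of the standard Q-learning Bellman optimality equation; writing γ for the discount factor, it reads:

```latex
Q(s, a) = R(s, a) + \gamma \max_{a'} Q(s', a')
```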
0
votes
1 answer

ValueError: cannot reshape array of size 1 into shape (1,4)

Commenting out the offending code also gives me this error: AssertionError: Cannot call env.step() before calling reset(). I'm trying to follow along with a tutorial on OpenAI Gym and am getting a NumPy error when reshaping the state of my environment. Both of…
0
votes
0 answers

Reinforcement learning: how to use a Q-learning algorithm with a Reinforce.jl environment?

I've created this MDP environment using Reinforce.jl. It's supposed to mimic the cake-eating problem, or consumption-savings problem. I want to use a Q-learning algorithm to find the optimal policy. However, the Reinforce.jl package only has a SARSA policy…
0
votes
0 answers

Q-learning network in a custom environment chooses the same action every time, despite the heavy negative reward

So I plugged QLearningDiscreteDense into a dots-and-boxes game I made. I created a custom MDP environment for it. The problem is that it chooses action 0 every time; the first time it works, but then it's no longer an available action, so it's an…
0
votes
1 answer

Multi-agent (not deep) reinforcement learning: modeling the problem

I have N agents/users accessing a single wireless channel, and at each time step only one agent can access the channel and receive a reward. Each user has a buffer that can store B packets, which I assume to be infinite. Each user…
0
votes
1 answer

Custom loss function for Deep Q-Learning

The following problem occurred while tackling a reinforcement learning task. In my code I eventually run into the following issue when calculating the loss: my neural network outputs 4 Q-values (given a state as input, it outputs the Q-value…
Peter
  • 183
  • 1
  • 1
  • 9
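A common resolution to the loss problem described above, sketched framework-agnostically with NumPy (the function name and array shapes are illustrative assumptions): compute the error only on the Q-value of the action actually taken, leaving the network's other outputs out of the loss.

```python
import numpy as np

def dqn_loss(q_pred, actions, targets):
    """Mean squared error between Q(s, a_taken) and the TD target.

    q_pred  : (batch, n_actions) network outputs
    actions : (batch,) indices of the actions actually taken
    targets : (batch,) TD targets, r + gamma * max_a' Q_target(s', a')
    """
    rows = np.arange(len(actions))
    q_taken = q_pred[rows, actions]  # select one Q-value per sample
    return float(np.mean((q_taken - targets) ** 2))
```

In an autodiff framework the same selection is usually done with a gather/one-hot mask so that gradients flow only through the chosen action's output.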
0
votes
1 answer

Incompatible array types are mixed in the forward input (LinearFunction) in machine learning

I have trained a deep Q-learning model using Chainer: class Q_Network(chainer.Chain): def __init__(self, input_size, hidden_size, output_size): super(Q_Network, self).__init__( fc1=L.Linear(input_size,…
William
  • 3,724
  • 9
  • 43
  • 76
0
votes
1 answer

Deep Q-Learning for grid world

Has anyone implemented Deep Q-learning to solve a grid-world problem where the state is the [x, y] coordinates of the player and the goal is to reach a certain coordinate [A, B]? The reward setting could be -1 for each step and +10 for reaching [A,…
corvo
  • 676
  • 2
  • 7
  • 20
0
votes
1 answer

What decides the epsilon decay value in reinforcement learning?

I've been learning Q-learning from the YouTube lecture below: https://www.youtube.com/watch?v=Gq1Azv_B4-4&list=PLlMOxjd7OfgNxJSgF8pAs3_qMion-X1QI&index=2 In this tutorial, the author uses an epsilon methodology like this (I cut the details out): import…
Baaam Park
  • 415
  • 1
  • 5
  • 7
0
votes
1 answer

How to set coordinates as a state space (range) for use in a Q-table?

Suppose I have a class Player that I want to use as my agent. I want all the coordinates possible in my environment to be my state space. In my environment, I want to use the coordinates of the player as my state. How should I go about setting my…
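For a bounded grid like the one the question describes, one common approach is to index the Q-table directly by the (x, y) coordinates. A minimal sketch (grid dimensions and action count are illustrative assumptions):

```python
import numpy as np

WIDTH, HEIGHT, N_ACTIONS = 10, 10, 4  # illustrative grid and action count

# One vector of Q-values per (x, y) coordinate: shape (WIDTH, HEIGHT, N_ACTIONS)
q_table = np.zeros((WIDTH, HEIGHT, N_ACTIONS))

def best_action(x, y):
    """Greedy action for the player standing at coordinate (x, y)."""
    return int(np.argmax(q_table[x, y]))
```

This works only while the coordinate range is small and discrete; for continuous or large coordinate spaces, discretization into buckets or a function approximator is needed instead.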