Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to learn an action-value function giving the expected utility (cumulative discounted reward) of taking a given action in a given state and acting optimally thereafter.
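The standard one-step Q-learning update for the learned action-value function Q, with learning rate α and discount factor γ, is:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The max over next actions, rather than the action the behaviour policy actually takes next, is what makes Q-learning off-policy.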

One of the strengths of Q-learning is that it needs only a reinforcement function to be given (i.e. a function which tells how well, or how badly, the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) against exploration (acting randomly to discover new states or actions better than currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy, as sketched below.
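As an illustration, epsilon-greedy action selection over a tabular Q-function might look like the following minimal sketch (the table shape, epsilon value, and state are arbitrary example choices):

```python
import numpy as np

def epsilon_greedy(Q, state, epsilon=0.1, rng=np.random.default_rng()):
    """With probability epsilon explore (uniform random action),
    otherwise exploit (greedy action under the current Q estimates)."""
    n_actions = Q.shape[1]
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))   # explore
    return int(np.argmax(Q[state]))           # exploit

# Example: a Q-table for 25 states and 5 actions, initialised to zero.
Q = np.zeros((25, 5))
action = epsilon_greedy(Q, state=0)
```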

447 questions
0
votes
0 answers

Using Q-learning to solve a knapsack problem

The question is: sugar is 1 gram for 1 dollar, a cookie is 7 grams for 5 dollars, and ice is 12 grams for 10 dollars. Now I have 29 dollars; how should I buy to get the heaviest total? I have found the code on the Internet, but I don't know how to modify it to solve my…
Lucas
  • 11
  • 1
0
votes
1 answer

Is it okay to remove the oldest experiences of a DQN?

I have created a DQN with a max memory size of 100000. I have a function that removes the oldest element in the memory if its size is greater than the max size. When I ran it for 200 episodes, I noticed that the memory was already full at the…
KKK
  • 507
  • 3
  • 12
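For reference, a bounded replay memory that evicts its oldest transitions automatically is commonly implemented with a deque, which removes the need for a manual deletion function; a minimal sketch (capacity and tuple layout are illustrative):

```python
import random
from collections import deque

class ReplayMemory:
    """FIFO experience buffer: appending past maxlen silently drops the oldest item."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)
```

Whether dropping the oldest experiences is "okay" is exactly what the question asks; the deque merely makes that FIFO policy the default behaviour.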
0
votes
1 answer

Why does the score (accumulated reward) go down during the exploitation phase in this Deep Q-Learning model?

I'm having a hard time trying to make a Deep Q-Learning agent find the optimal policy. This is what my current model looks like in TensorFlow: model = Sequential() model.add(Dense(units=32, activation="relu",…
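The excerpt above is cut off; a complete network in the same style might look like the sketch below. The input size, second layer, and action count are assumptions for illustration, not the asker's actual values:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

n_inputs, n_actions = 4, 2   # assumed state and action dimensions

model = Sequential()
model.add(Dense(units=32, activation="relu", input_shape=(n_inputs,)))
model.add(Dense(units=32, activation="relu"))
model.add(Dense(units=n_actions, activation="linear"))  # one Q-value per action
model.compile(optimizer="adam", loss="mse")
```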
0
votes
1 answer

Q-agent is really broken, can't decide between a reward of 0 and -1

I was using a DQN for something; it wasn't working. I simplified the problem so that there are 2 actions: 0 and 1. Each action corresponds to a single reward: 0 or -1. Still, my Q-agent is consistently confused, giving the two actions wild values in…
0
votes
1 answer

Q-agent is learning not to take any actions

I'm training a deep Q network to trade stocks; it has two possible actions: 0: wait; 1: buy a stock if one isn't held, sell it if one is. It gets, as input, the value of the stock it bought, the current value of the stock and the values of…
RichKat
  • 57
  • 1
  • 8
0
votes
1 answer

Reinforcement learning: Q-learning to determine the order to cast spells optimally?

Say I have a wizard with 20 spells, each of which does different things: sometimes direct damage, sometimes disabling, sometimes protecting, etc. He has a fight with 10 orcs and I want to determine an optimal order of spell casting to kill the…
0
votes
2 answers

What's wrong with Dyna-Q? (Dyna-Q vs Q-learning)

I implemented the Q-learning algorithm and used it on FrozenLake-v0 on OpenAI Gym. I am getting 185 total rewards during training and 7333 total rewards during testing over 10000 episodes. Is this good? I also tried the Dyna-Q algorithm, but it is…
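For context, Dyna-Q augments the ordinary Q-learning update with extra planning updates replayed from a learned model of the environment; a tabular sketch of the planning loop, assuming a deterministic environment and dict-based Q and model structures:

```python
import random

def dyna_q_planning(Q, model, n_planning, alpha=0.1, gamma=0.99):
    """Replay n_planning simulated transitions from the learned model.

    `model` maps each previously observed (state, action) pair to the
    (reward, next_state) that followed it, as in tabular Dyna-Q.
    """
    for _ in range(n_planning):
        state, action = random.choice(list(model.keys()))
        reward, next_state = model[(state, action)]
        best_next = max(Q[next_state].values())
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```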
0
votes
0 answers

Deep Q-learning: why does using the same net for the target net and the prediction net result in instability?

For deep Q-learning I can kind of imagine the neural net as the Q-table of normal Q-learning. So if in Q-learning the Q-table is updated in place, why can't we use the same net for the target Q-net and the prediction Q-net? I searched on Google…
J.R.
  • 769
  • 2
  • 10
  • 25
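The usual answer, following the DQN paper (Mnih et al., 2015), is that bootstrapped targets computed from the very network being updated chase a moving target; a separate target network, synced only periodically, keeps the targets fixed between syncs. A minimal Keras-style hard-update sketch (the sync interval is an assumed hyperparameter):

```python
def maybe_sync_target(online_net, target_net, step, sync_every=1_000):
    """Copy the online network's weights into the frozen target network
    every sync_every training steps (a "hard" target update)."""
    if step % sync_every == 0:
        target_net.set_weights(online_net.get_weights())
```

Targets are then computed with target_net while gradients flow only through online_net.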
0
votes
0 answers

How can I interpolate missing Reward-matrix entries (Q-learning)?

I have a simple game on a grid. 25 states, five actions per state (left, right, up, down, stay). There might be special rules for edges and corners, but these won't matter here. My reward matrix (below) is pretty sparse, but this is all the data I…
Shay
  • 1,368
  • 11
  • 17
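One plain way to hold such a sparse reward matrix before any interpolation is a 25×5 array with NaN marking unobserved entries, which keeps the known/unknown split explicit; a sketch (the two filled entries are placeholders, not the question's data):

```python
import numpy as np

n_states, n_actions = 25, 5
R = np.full((n_states, n_actions), np.nan)   # NaN = reward not yet observed

R[0, 1] = 1.0     # placeholder observed entries
R[6, 3] = -1.0

observed = ~np.isnan(R)
print(f"{observed.sum()} of {R.size} entries observed")
```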
0
votes
0 answers

Deep Q-Learning: How to visualize convergence?

I have trained an RL agent in an environment similar to the Puckworld. There's no puck though! The agent is in continuous space and wants to reach a fixed target. Each episode the agent is born at a random location and there is added noise to each…
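A common way to visualize convergence in a noisy setup like this is to plot a moving average of the per-episode return, which smooths out the injected randomness; a sketch assuming the returns have already been collected (the data here is synthetic):

```python
import numpy as np
import matplotlib.pyplot as plt

def moving_average(x, window=100):
    """Rolling mean over the last `window` episodes."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

returns = np.random.randn(2000).cumsum()          # synthetic placeholder data
plt.plot(returns, alpha=0.3, label="raw return")
plt.plot(np.arange(99, len(returns)),             # offset by window - 1
         moving_average(returns), label="100-episode moving average")
plt.xlabel("episode"); plt.ylabel("return"); plt.legend(); plt.show()
```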
0
votes
1 answer

How do I set up a state space for Q-learning?

This is apparently very obvious and basic, because I can't find any tutorials on it, but how do I set up a state space for a Q-learning environment? If I understand correctly, every state needs to be associated with a single value, right? If so,…
RichKat
  • 57
  • 1
  • 8
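For a tabular agent the answer is essentially yes: each distinct observation must map to one row index of the Q-table. For a grid world this is usually done by flattening the coordinates; a minimal sketch (the grid size is an assumed example):

```python
WIDTH, HEIGHT = 5, 5                  # assumed grid dimensions

def state_index(x, y):
    """Flatten an (x, y) grid position into a single Q-table row index."""
    return y * WIDTH + x              # indices 0 .. WIDTH * HEIGHT - 1

assert state_index(0, 0) == 0
assert state_index(4, 4) == 24        # 25 distinct states in total
```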
0
votes
1 answer

How many states could I work with on my ordinary home computer when using Q-learning?

How many states could I work with on my ordinary home computer when I want to implement a reinforcement learning algorithm such as Q-Learning? 1 thousand, 1 million, more?
MMM
  • 373
  • 1
  • 4
  • 12
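As a rough back-of-the-envelope answer, a tabular Q-table costs states × actions × bytes-per-entry, so even millions of states fit comfortably in RAM when the action set is small; a sketch (10 actions and float32 entries are assumptions):

```python
def q_table_bytes(n_states, n_actions, bytes_per_entry=4):   # float32 entries
    return n_states * n_actions * bytes_per_entry

for n in (1_000, 1_000_000, 100_000_000):
    mb = q_table_bytes(n, n_actions=10) / 1e6
    print(f"{n:>11,} states x 10 actions ~ {mb:,.0f} MB")
```

In practice the limit is usually how many states the agent can visit often enough to learn, not the memory itself.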
0
votes
1 answer

Does the training loss diagram show over-fitting? (Deep Q-learning)

The diagram below shows the training loss values against epochs. Based on the diagram, does it mean the model is over-fitting? If not, what is causing the spikes in the loss values across epochs? Overall, it can be observed that the loss value is in…
Yeo Keat
  • 143
  • 1
  • 9
0
votes
1 answer

DQN Model ValueError: setting an array element with a sequence

(All references to code can be found at https://github.com/EXJUSTICE/Doom_DQN_GC/blob/master/TF2_Doom_GC_CNN.ipynb) Background I apologize for the length of this post; I wanted it to be as clear as possible. I've been adapting some Atari OpenAI gym…
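This particular ValueError typically means NumPy was asked to build a rectangular array from sequences of unequal shape, which often happens when stacking differently-shaped frames or states from replay memory; a minimal reproduction (the shapes are illustrative, not taken from the linked notebook):

```python
import numpy as np

frames = [np.zeros((84, 84)), np.zeros((84, 84))]
batch = np.array(frames)                  # fine: equal shapes stack to (2, 84, 84)

ragged = [np.zeros((84, 84)), np.zeros((80, 80))]
try:
    np.array(ragged, dtype=np.float32)    # unequal shapes raise ValueError
except ValueError as e:
    print("ValueError:", e)
```

The fix is to make every element the same shape (e.g. consistent preprocessing and cropping) before stacking.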
0
votes
0 answers

Is there any method in reinforcement learning to select multiple simultaneous actions?

I'm working on a research project that involves the application of reinforcement learning to planning and decision-making problems. Typically, these problems involve picking (sampling) multiple actions within a state based on ranking [max_q to…
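One straightforward way to pick several simultaneous actions from a Q-vector, in the ranking spirit the excerpt describes, is to take the top-k indices by estimated value; a sketch (k and the Q-values are placeholders):

```python
import numpy as np

def top_k_actions(q_values, k=3):
    """Return the indices of the k highest Q-values, best first."""
    return np.argsort(q_values)[::-1][:k]

q = np.array([0.2, 1.5, -0.3, 0.9, 0.4])
print(top_k_actions(q))   # -> [1 3 4]
```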