Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to find an action-value function giving the expected utility (cumulative reward) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it only needs a reinforcement function to be given (i.e. a function which tells how well, or how badly, the agent is performing). During the learning process, the agent must balance exploitation (acting greedily with respect to the current action-value function) against exploration (acting randomly to discover new states or actions better than those currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
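As a minimal sketch of these ideas, the following self-contained example (a made-up 6-state chain task, not taken from any question below) runs tabular Q-learning with an epsilon-greedy policy; only the reward function inside `step` is given to the agent:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 6, 2        # toy chain: action 1 moves right, action 0 stays
alpha, gamma, epsilon = 0.1, 0.1, 0.1
Q = np.zeros((n_states, n_actions))

def step(s, a):
    # reinforcement function: reward 1 only for reaching the terminal state
    s_next = min(s + 1, n_states - 1) if a == 1 else s
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy: explore with probability epsilon, otherwise exploit
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(Q[s]))
        s_next, r, done = step(s, a)
        # Q-learning update; the max over next actions is what makes it off-policy
        target = r + (0.0 if done else gamma * np.max(Q[s_next]))
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next

print(np.argmax(Q, axis=1))  # greedy action per state
```

After training, the greedy policy moves right (action 1) in every non-terminal state. The learning rate `alpha`, discount `gamma`, and `epsilon` values here are arbitrary illustration choices.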

447 questions
0
votes
3 answers

Why is Q-Learning Off-Policy Learning?

Hello Stack Overflow Community! Currently, I am following the Reinforcement Learning lectures of David Silver and am really confused at one point in his "Model-Free Control" slides. In the slides, Q-Learning is considered off-policy learning. I…
test
  • 93
  • 1
  • 11
0
votes
1 answer

ModuleNotFoundError: No module named 'std_msgs' - Gazebo installation

I am trying to install gym_gazebo on my Ubuntu 16.04 LTS system according to https://github.com/erlerobot/gym-gazebo Everything is getting installed correctly, however, while trying to run python circuit2_turtlebot_lidar_qlearn.py , I get error as…
Pallavi
  • 548
  • 2
  • 9
  • 18
0
votes
1 answer

Q learning algorithm for robot where next state is not defined

I am new to machine learning and I am developing a robot whose environment is dynamic. I am using Python as the programming language for my project. I have a goal state, and the robot has four actions: forward, backward, turn right and turn left. The…
0
votes
2 answers

Tensorflow reinforcement learning model will barely ever make a decision on its own and will not learn

I am trying to create a reinforcement learning agent that can buy, sell or hold stock positions. The issue I'm having is that even after over 2000 episodes, the agent still can not learn when to buy, sell or hold. Here is an image from the 2100th…
0
votes
1 answer

Defining states, Q and R matrix in reinforcement learning

I am new to RL and I am referring to a couple of books and tutorials, yet I have a basic question and I hope to find the fundamental answer here. The primary book referred to: Sutton & Barto, 2nd edition, and a blog. Problem description (only Q-learning…
0
votes
2 answers

How to choose the reward function for the cart-pole inverted pendulum task

I am new to Python, or any programming language for that matter. For months now I have been working on stabilising the inverted pendulum. I have gotten everything working but am struggling to get the right reward function. So far, after researching and…
0
votes
1 answer

Python - high disk usage in SumTree

I've encountered some weird behaviour in my Python program. Basically, when I tried to create and fill a SumTree of length larger than 1000, my disk usage increased a lot, to ~300 MB/s, and then the program died. I'm pretty sure there's no file r/w…
0
votes
1 answer

How to train a neural network with Q-Learning

I just implemented Q-Learning without neural networks but I am stuck on implementing it with neural networks. I will give you pseudocode showing how my Q-Learning is implemented: train(int iterations) buffer = empty buffer for i = 0…
Finn Eggers
  • 857
  • 8
  • 21
0
votes
1 answer

making my multi-agent environment by deep reinforcement learning

I need to build my own environment and apply the DQN algorithm in a multi-agent setting. I have 4 agents. Each state of my environment has 5 variables, state = [p1, p2, p3, p4, p5]; at each time step, we update the different parameters of all states.…
0
votes
1 answer

Q Learning w/ Galaga - Defining States

I am working on an implementation of Q-Learning to build an AI to play Galaga. I understand that Q-learning requires states and actions, and tables to determine movement between states. All the examples and tutorials for Q-Learning online seem to be…
Simon
  • 1
  • 1
0
votes
1 answer

The purpose of using Q-Learning algorithm

What is the point of using Q-Learning? I have used example code that represents a 2D board with a pawn moving on it. At the right end of the board there is a goal which we want to reach. After the algorithm completes, I have a Q table with…
padrian92
  • 147
  • 4
  • 16
0
votes
1 answer

Experience Replay is making my agent worse

I have 'successfully' set up a Q-network for solving the 'FrozenLake-v0' env of the OpenAI gym (at least, I think.. not 100% sure how I score - I get 70 to 80 out of 100 successful episodes after 5k episodes of training without Experience Replay).…
0
votes
1 answer

Normalization of input data to Qnetwork

I am well aware that a "normal" neural network should use normalized input data so that one variable does not have a bigger influence on the weights in the NN than others. But what if you have a Q-network where your training data and test data can…
Søren Koch
  • 145
  • 1
  • 1
  • 10
0
votes
0 answers

Simple Q-learning neural network using numpy

import numpy as np
from numpy import exp, array, random, dot
R = np.matrix([[-1, -1, -1, -1, 1, -1],  # for correct action the reward is 1 and for wrong action it's -1
               [-1, -1, -1, 1, -1, 1],
               [-1,…
0
votes
1 answer

Calculating Q value in dqn with experience replay

consider the Deep Q-Learning algorithm:
1 initialize replay memory D
2 initialize action-value function Q with random weights
3 observe initial state s
4 repeat
5   select an action a
6     with probability ε select a random…
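The Q-value (target) computation that this algorithm performs on a replayed batch can be sketched as follows. This is only an illustrative stand-in: the buffer contents and the 4-state, 2-action task are made up, and an array plays the role of the Q network so the target formula is visible without any deep-learning framework:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.99

# array-based stand-in for the Q network, over a hypothetical 4-state, 2-action task
n_states, n_actions = 4, 2
Q = rng.random((n_states, n_actions))

# hypothetical replay buffer of (s, a, r, s_next, done) transitions
buffer = [
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 2, False),
    (2, 1, 1.0, 3, True),   # terminal transition: no bootstrap term
]

# a real DQN samples a random minibatch; here we replay the whole buffer
targets = []
for s, a, r, s_next, done in buffer:
    # target: y = r for terminal transitions, else y = r + gamma * max_a' Q(s', a')
    y = r if done else r + gamma * float(np.max(Q[s_next]))
    targets.append(y)
    # a network would minimize (Q(s, a) - y)^2 by gradient descent;
    # with an array we can just move Q(s, a) toward y directly
    Q[s, a] += 0.5 * (y - Q[s, a])

print(targets)
```

Note that the terminal transition's target is just its reward, which is exactly the branch the pseudocode's ε-greedy loop distinguishes when it stores `done` flags in the replay memory.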