Questions tagged [q-learning]

Q-learning is a model-free reinforcement learning technique.

Q-learning is a model-free, off-policy reinforcement learning technique that aims to learn an action-value function giving the expected utility (cumulative reward) of taking a given action in a given state and following the optimal policy thereafter.

One of the strengths of Q-learning is that it needs only a reward function to be given (i.e. a function which tells how well or how badly the agent is performing). During the learning process, the agent needs to balance exploitation (acting greedily with respect to the current action-value function) against exploration (acting randomly to discover new states or actions better than those currently estimated). A common, simple way of handling this trade-off is an epsilon-greedy policy.
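The update rule and epsilon-greedy trade-off described above can be sketched in a few lines of tabular Q-learning. This is a minimal illustration on a hypothetical 1-D corridor environment (the states, actions, and hyperparameter values are made up for the example, not taken from any question below):

```python
import numpy as np

# Tabular Q-learning with an epsilon-greedy policy on a toy 1-D corridor:
# states 0..4, actions 0 = left, 1 = right, reward +1 for reaching state 4.
n_states, n_actions = 5, 2
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

def step(state, action):
    """Deterministic corridor dynamics: move left or right, reward at the far end."""
    next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: off-policy, bootstraps from the max over next actions
        # regardless of which action the behaviour policy actually takes next.
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

# After training, the greedy policy moves right in every non-terminal state.
print(np.argmax(Q, axis=1)[:-1])
```

Note that the `max` over next-state actions in the update is what makes Q-learning off-policy: it evaluates the greedy target policy while following the exploratory epsilon-greedy behaviour policy (SARSA, by contrast, uses the action actually taken next).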

447 questions
2
votes
1 answer

Q-learning value update

I am working on the power management of a device using the Q-learning algorithm. The device has two power modes, i.e., idle and sleep. When the device is asleep, the requests for processing are buffered in a queue. The Q-learning algorithm looks for…
1
vote
1 answer

What is the difference between grid[index] vs grid[index, :] in Python

In this https://colab.research.google.com/drive/1gS2aJo711XJodqqPIVIbzgX1ktZzS8d8?usp=sharing , they used np.max(qtable[new_state, :]) But I did an experiment and I don't understand the need for : . My experiment shows the same value, same array…
TSR
  • 17,242
  • 27
  • 93
  • 197
1
vote
0 answers

Defining state and action for Q-learning in the code

I am trying to understand the following code for the simulator to avoid collisions with the help of Q-learning. The examples and tutorials which I followed had the space divided into blocks, such as taxiv3, so it was easier to define state and…
1
vote
0 answers

Converting gym.box into gym.discrete in openAI gym

I'm trying to implement the Q-learning algorithm over some of the test beds in OpenAI Gym and was trying to convert some of the spaces, since different environments have different action and observation spaces. I'm aware of the existence of wrappers but…
1
vote
0 answers

My neural network can't solve the maze with tensorflow.net and qlearning

I am practicing neural networks with TensorFlow and Q-learning. For my project I work in C# to be able to later migrate my program to the Unity game engine. I use the TensorFlow.Net library: https://github.com/SciSharp/TensorFlow.NET To begin my…
1
vote
0 answers

State representation Job Shop Scheduling

I am trying to do job shop scheduling solved with a reinforcement Q-learning agent. This is what I have right now: https://git.uni-wuppertal.de/1523811/tmp Unfortunately the agent does not learn well. I think I have to rethink the state…
Marvin
  • 11
  • 2
1
vote
1 answer

Implementing SARSA from Q-Learning algorithm in the frozen lake game

I am solving the frozen lake game using Q-Learning and SARSA algorithms. I have the code implementation of the Q-Learning algorithm and that works. This code was taken from Chapter 5 of "Deep Reinforcement Learning Hands-on" by Maxim Lapan. I am…
ronanwa
  • 11
  • 1
1
vote
1 answer

ValueError: Input 0 of layer sequential_5 is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: [None, 953]

import pyautogui as pg from PIL import ImageGrab, ImageOps import numpy as np import tensorflow as tf from tensorflow import keras from time import sleep from tensorflow.keras.layers import * import cv2 import random box = (0,259,953,428) def…
1
vote
1 answer

Update DOM from loop in JavaScript

I am making a maze solver via the Q-learning algorithm. I have a width X height maze that is generated randomly. Each cell of the maze is a div. I have CSS code for the different cell types. .wall { background-color: red; } .currentPosition { …
I2mNew
  • 123
  • 11
1
vote
0 answers

How to implement Q-Learning on an array-backed grid?

I am trying to implement the Q-Learning Algorithm on a randomly generated maze that I created using Numpy and visualized using PyGame. The shape of the array is 50x50 as shown below: The red boxes represent obstacles (indicated with a 1) The white…
1
vote
1 answer

Convergence time of Q-learning Vs Deep Q-learning

I want to know about the convergence time of Deep Q-learning vs Q-learning when run on the same problem. Can anyone give me an idea of the pattern between them? It would be better if it were explained with a graph.
A_Tahsin
  • 31
  • 3
1
vote
1 answer

q table with gym (using box observation space)

I'm trying to run a Q-learning algorithm with this observation space: self.observation_space = spaces.Box(low=np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), high=np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), dtype=np.flo when I'm trying to access the q…
user3599541
  • 103
  • 9
1
vote
1 answer

Deep Q learning not performing well for algo-trading

I have implemented deep Q-learning in Python using the Keras framework to reproduce a paper's results. However, it is not working. Here is some info: the training is done in 10000 steps for the agent for 2 possible actions the input vector's shape is…
1
vote
1 answer

Getting a state from gym-minigrid for Q-learning

I'm trying to create a Q-learner in the gym-minigrid environment, based on an implementation I found online. The implementation works just fine, but it uses the normal Open AI Gym environment, which has access to some variables that are not present,…
JoJoDeath
  • 75
  • 1
  • 1
  • 6
1
vote
1 answer

How does this implementation of the DQN algorithm on TensorFlowJs work?

devs, I found a bunch of examples of DQN implementations, but because I'm no TensorFlow expert, I'm a little bit confused. Let's see, here is one of them. I can understand that on the 73rd line we slice some batch of stored data [{state, action,…
Vadim
  • 13
  • 5