Questions tagged [reinforcement-learning]

Reinforcement learning is an area of machine learning and computer science concerned with how to select an action in a state that maximizes a numerical reward in a particular environment.

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, Artificial Intelligence, or Computer Science instead. Otherwise you're probably off-topic.

Reinforcement learning is learning what to do--how to map situations to actions--so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics--trial-and-error search and delayed reward--are the two most important distinguishing features of reinforcement learning.

From Reinforcement Learning: An Introduction
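
The two features named in the quote, trial-and-error search and delayed reward, can be illustrated with a minimal sketch of tabular Q-learning. Everything below is an assumption made for illustration and is not taken from the book: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore at random with probability epsilon
            # (or when the estimates are tied), otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Delayed reward: states far from the goal only learn their value
            # by bootstrapping from the estimate of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return [ACTIONS[0] if left > right else ACTIONS[1] for left, right in q[:-1]]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal state. The agent is never told that moving right is correct; it only sees a reward at the end of the chain and has to propagate that information backwards through its value estimates.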

Significant Literature

Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto, MIT Press, Cambridge, MA, 1998

2632 questions

148

votes

7 answers

How to train an artificial neural network to play Diablo 2 using visual input?

I'm currently trying to get an ANN to play a video game and I was hoping to get some help from the wonderful community here. I've settled on Diablo 2. Game play is thus in real-time and from an isometric viewpoint, with the player controlling a…

machine-learning computer-vision neural-network video-processing reinforcement-learning

asked Jun 30 '11 at 23:47

zergylord

4,368
5
38
60

146

votes

8 answers

What is the difference between Q-learning and SARSA?

Although I know that SARSA is on-policy while Q-learning is off-policy, when looking at their formulas it's hard (to me) to see any difference between these two algorithms. According to the book Reinforcement Learning: An Introduction (by Sutton and…

artificial-intelligence reinforcement-learning q-learning sarsa

asked Jul 27 '11 at 17:46

Ælex

14,432
20
88
129

140

votes

5 answers

What is the difference between value iteration and policy iteration?

In reinforcement learning, what is the difference between policy iteration and value iteration? As much as I understand, in value iteration, you use the Bellman equation to solve for the optimal policy, whereas, in policy iteration, you randomly…

machine-learning reinforcement-learning markov-models value-iteration

asked May 22 '16 at 02:43

Arslán

1,711
2
12
15

votes

2 answers

Training a Neural Network with Reinforcement learning

I know the basics of feedforward neural networks, and how to train them using the backpropagation algorithm, but I'm looking for an algorithm that I can use for training an ANN online with reinforcement learning. For example, the cart pole swing up…

algorithm language-agnostic machine-learning neural-network reinforcement-learning

asked May 23 '12 at 14:27

Kendall Frey

43,130
20
110
148

votes

4 answers

What is the way to understand Proximal Policy Optimization Algorithm in RL?

I know the basics of reinforcement learning, but which terms do I need to understand to be able to read the arXiv PPO paper? What is the roadmap to learn and use PPO?

machine-learning reinforcement-learning

asked Sep 26 '17 at 09:36

Alexander Cyberman

2,114
3
20
21

votes

6 answers

How can I apply reinforcement learning to continuous action spaces?

I'm trying to get an agent to learn the mouse movements necessary to best perform some task in a reinforcement learning setting (i.e. the reward signal is the only feedback for learning). I'm hoping to use the Q-learning technique, but while I've…

algorithm machine-learning reinforcement-learning q-learning

asked Aug 17 '11 at 19:54

zergylord

4,368
5
38
60

votes

3 answers

What is a policy in reinforcement learning?

I've seen such words as: A policy defines the learning agent's way of behaving at a given time. Roughly speaking, a policy is a mapping from perceived states of the environment to actions to be taken when in those states. But still didn't fully…

machine-learning terminology reinforcement-learning markov-decision-process

asked Sep 17 '17 at 04:52

Alexander Cyberman

2,114
3
20
21

votes

3 answers

What is the difference between Q-learning and Value Iteration?

How is Q-learning different from value iteration in reinforcement learning? I know Q-learning is model-free and training samples are transitions (s, a, s', r). But since we know the transitions and the reward for every transition in Q-learning, is…

machine-learning artificial-intelligence reinforcement-learning q-learning

asked Mar 09 '15 at 08:32

huskywolf

votes

1 answer

OpenAI Gym: Understanding `action_space` notation (spaces.Box)

I want to set up an RL agent on the OpenAI CarRacing-v0 environment, but before that I want to understand the action space. In the code on GitHub, line 119 says: self.action_space = spaces.Box( np.array([-1,0,0]), np.array([+1,+1,+1])) # steer, gas,…

reinforcement-learning openai-gym

asked Jun 07 '17 at 05:33

Toke Faurby

5,788
9
41
62

votes

2 answers

What is the difference between reinforcement learning and deep RL?

What is the difference between deep reinforcement learning and reinforcement learning? I basically know what reinforcement learning is about, but what does the concrete term deep stand for in this context?

machine-learning reinforcement-learning q-learning

asked Jun 22 '16 at 16:00

Christopher Klaus

votes

5 answers

When should I use support vector machines as opposed to artificial neural networks?

I know SVMs are supposedly 'ANN killers' in that they automatically select representation complexity and find a global optimum (see here for some SVM praising quotes). But here is where I'm unclear -- do all of these claims of superiority hold for…

machine-learning neural-network svm reinforcement-learning

asked Jul 14 '11 at 20:00

zergylord

4,368
5
38
60

votes

4 answers

Openai gym environment for multi-agent games

Is it possible to use OpenAI's Gym environments for multi-agent games? Specifically, I would like to model a card game with four players (agents). The player scoring a turn starts the next turn. How would I model the necessary coordination between…

reinforcement-learning openai-gym

asked Jun 05 '17 at 13:19

Martin Studer

2,213
1
18
23

votes

2 answers

Tensorflow and Multiprocessing: Passing Sessions

I have recently been working on a project that uses a neural network for virtual robot control. I used tensorflow to code it up and it runs smoothly. So far, I used sequential simulations to evaluate how good the neural network is, however, I want…

python parallel-processing multiprocessing tensorflow reinforcement-learning

asked Apr 13 '16 at 21:54

MrRed

votes

6 answers

NameError: name 'base' is not defined OpenAI Gym

[Note that I am using xvfb-run -s "-screen 0 1400x900x24" jupyter notebook] I try to run a basic set of commands in OpenAI Gym import gym env = gym.make("CartPole-v0") obs = env.reset() env.render() but I get the following…

reinforcement-learning openai-gym

asked Nov 25 '18 at 23:11

midawn98

votes

9 answers

Good implementations of reinforcement learning?

For an ai-class project I need to implement a reinforcement learning algorithm which beats a simple game of tetris. The game is written in Java and we have the source code. I know the basics of reinforcement learning theory but was wondering if…

language-agnostic artificial-intelligence machine-learning reinforcement-learning

asked Apr 11 '09 at 16:32

bdd

3,436
5
31
43
