What are the states and rewards in the reward matrix?

Question

This code:

R = ql.matrix([ [0,0,0,0,1,0],
                [0,0,0,1,0,1],
                [0,0,100,1,0,0],
                [0,1,1,0,1,0],
                [1,0,0,1,0,0],
                [0,1,0,0,0,0] ])

is from:

https://github.com/PacktPublishing/Artificial-Intelligence-By-Example/blob/47bed1a88db2c9577c492f950069f58353375cfe/Chapter01/MDP.py

R is defined as the "Reward matrix for each state". What are the states and rewards in this matrix?

# Reward for state 0
print('R[0,]:' , R[0,])

# Reward for state 1
print('R[1,]:' , R[1,])

prints:

R[0,]: [[0 0 0 0 1 0]]
R[1,]: [[0 0 0 1 0 1]]

Is [0 0 0 0 1 0] state 0 and [0 0 0 1 0 1] state 1?

score 1 · Accepted Answer · answered Feb 09 '20 at 22:15

According to the book that uses that example, R represents the reward of the transitions from one current state s to another next state s'.

Specifically, R is associated with a graph whose nodes are the six states, labeled A to F.

Each row of the matrix R corresponds to a current state, labeled A to F, and each column corresponds to a next state, also labeled A to F. The nonzero values represent the edges of the graph, not the nodes. For example, R[0,]: [[0 0 0 0 1 0]] means that you can go from state s=A to next state s'=E and receive a reward of 1. Similarly, R[1,]: [[0 0 0 1 0 1]] means that you receive a reward of 1 if you go from B to D or F. The goal seems to be reaching and remaining in C, since the transition from C to itself (R[2,2] = 100) carries the largest reward.
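To make the row/column reading concrete, here is a minimal sketch that lists every rewarded transition in R. It assumes the rows and columns map to the letters A to F in order, as described above, and it uses a plain NumPy array rather than the `ql.matrix` from the question; the names `STATES` and `transitions` are made up for this illustration.

```python
import numpy as np

# Same values as R in the question, stored as a plain 2-D array (an assumption
# of this sketch; the question builds it with ql.matrix).
R = np.array([[0, 0, 0, 0, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 100, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0]])

STATES = "ABCDEF"  # hypothetical labels: row/column i corresponds to STATES[i]


def transitions(reward):
    """Return (state, next_state, reward) for every nonzero entry, row by row."""
    rows, cols = np.nonzero(reward)
    return [(STATES[r], STATES[c], int(reward[r, c])) for r, c in zip(rows, cols)]


edges = transitions(R)
for s, s_next, r in edges:
    print(f"{s} -> {s_next}: reward {r}")
```

Each printed line is one edge of the graph: the row gives the current state, the column gives the next state, and the matrix entry is the reward for that move.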

Pablo EM

Thanks for sharing. Can you link to an online version of the book, or is this from a copy you own? – blue-sky Feb 09 '20 at 22:26

Since this part of the book corresponds to the first chapter, I was using the "first chapter preview" in Amazon. Not very useful in general, but enough for answering your question. Sorry! – Pablo EM Feb 10 '20 at 08:02
