I am building a single-player board game as a hobby, along with a Q-learner for it. Following the usual Q-learning approach, I will keep a table of rewards indexed by (state, action). Each board position after a key press counts as a 'state', and the board is stored as vector<vector<int>> Board.
There are always 8 possible key presses (actions) in each state. The code needs to check whether the current state matches a previously explored one and, if so, re-evaluate its rewards; if not, it should insert it as a new state. So I need a fast way to compare vectors of vectors of int, with the actions forming the second dimension of the reward table. What kind of approach should I take for the comparison? Maps/sets? Anything else?
- First thought would be to try a hashing strategy – kmdreko Sep 19 '17 at 09:33
- @vu1p3n0x can you give an example of one? I am new to C. – NONONONONO Sep 19 '17 at 11:54
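
One way the hashing strategy from the first comment might look is sketched below. It keys a std::unordered_map on the whole board, so checking whether a state has been seen before is an average constant-time lookup instead of a scan over all stored boards. The names BoardHash, QTable, kNumActions and lookupOrInsert are made up for this illustration, and the hash-combining step is only one common choice (the constant is borrowed from Boost's hash_combine), not the only reasonable one.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

using Board = std::vector<std::vector<int>>;

// Assumption taken from the question: eight key presses (actions) per state.
constexpr std::size_t kNumActions = 8;

// Hypothetical hash functor: folds every cell of the board into one value.
// Two equal boards always hash equally; different boards may collide, in
// which case unordered_map falls back to comparing them with operator==.
struct BoardHash {
    std::size_t operator()(const Board& board) const {
        std::size_t seed = board.size();
        for (const auto& row : board) {
            for (int cell : row) {
                seed ^= std::hash<int>{}(cell) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
            }
        }
        return seed;
    }
};

// Q-table: board state -> one value per action.
using QTable = std::unordered_map<Board, std::array<double, kNumActions>, BoardHash>;

// Returns the action values for `board`. If the state has not been seen
// before, operator[] inserts it with all eight values set to zero.
std::array<double, kNumActions>& lookupOrInsert(QTable& table, const Board& board) {
    return table[board];
}
```

A call such as lookupOrInsert(table, board)[action] += delta; then covers both the "already explored" and the "new state" case. If writing a hash feels like too much at first, std::map<Board, std::array<double, 8>> works with no extra code, because std::vector already provides a lexicographic operator<; lookups then cost O(log n) board comparisons rather than one hash computation.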