I am doing my final-year project on machine learning for a computer checkers game.

In this game I have automated one player (which moves randomly), and I want the second player to learn by playing against that randomness and get smarter over more games and trials.

As I said, the first player is automated and works fine, but I am having a problem with the second player's moves.

I am using the following target function:

  1. v(b) = w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4 + w5*x5 + w6*x6

    where
    x1 = number of white pieces
    x2 = number of black pieces
    x3 = number of white kings
    x4 = number of black kings
    x5 = number of white pieces that are threatened
    x6 = number of black pieces that are threatened

    and w0 to w6 are the weights to be learned by the algorithm.

  2. Now take the initial board state and assign random weights. With the random weights (12, -15, 6, 19, -5, 3) and the initial features x1 = 12, x2 = 12, x3 = 0, x4 = 0, x5 = 0, x6 = 0, we get v(b) = 12*12 + (-15)*12 = -36 (reading the six values as w1 to w6, with w0 = 0).

  3. But -36 doesn't give me a valid position to move to and learn from.
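
As a sanity check on the arithmetic above, here is a minimal Python sketch of the linear evaluation. It assumes the six listed values are w1 to w6 and that w0 is 0, which is the reading that reproduces -36. The function name `evaluate` is illustrative and not taken from the project.

```python
def evaluate(weights, features):
    """Linear evaluation v(b) = w0 + w1*x1 + ... + w6*x6.

    weights  -- [w0, w1, ..., w6]
    features -- [x1, ..., x6] extracted from board b
    """
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(rest, features))


# Assumption: w0 = 0, and the six values from the question are w1..w6.
weights = [0, 12, -15, 6, 19, -5, 3]
# Initial board: 12 white pieces, 12 black pieces, no kings, nothing threatened.
initial_features = [12, 12, 0, 0, 0, 0]

v_initial = evaluate(weights, initial_features)
print(v_initial)  # -36
```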

How do I get from this value to a move to play?

It would be a great help if you could help me solve this problem.

jrenk

2 Answers

This is not a linear problem. Try this resource and let me know if it helps: https://kartikkukreja.wordpress.com/2015/07/12/creating-a-bot-for-checkers/

pissall
  • elaborate on links; see: https://stackoverflow.com/help/how-to-answer and https://meta.stackexchange.com/questions/8231/are-answers-that-just-contain-links-elsewhere-really-good-answers – Mohammad Athar Oct 24 '17 at 18:28

I got the answer. My mentor told me that we have to compute the target function v(b) for each possible move at a given turn and compare the results. We play whichever move gives the maximum v(b) and update the weight values accordingly. For example, this is the situation at the first turn:

+ - + - + - + - + - + - + - + - +
|   |w32|   |w31|   |w30|   |w29|
+ - + - + - + - + - + - + - + - +
|w28|   |w27|   |w26|   |w25|   |
+ - + - + - + - + - + - + - + - +
|   |w24|   |w23|   |w22|   |w21|
+ - + - + - + - + - + - + - + - +
| 20|   | 19|   | 18|   | 17|   |
+ - + - + - + - + - + - + - + - +
|   | 16|   | 15|   | 14|   | 13|
+ - + - + - + - + - + - + - + - +
|b12|   |b11|   |b10|   |b9 |   |
+ - + - + - + - + - + - + - + - +
|   |b8 |   |b7 |   |b6 |   |b5 |
+ - + - + - + - + - + - + - + - +
|b4 |   |b3 |   |b2 |   |b1 |   |
+ - + - + - + - + - + - + - + - +

We play the black side. Turn 1:

These are the moves available on this turn:

Move 0: 9 to 13
Move 1: 10 to 14
Move 2: 11 to 15
Move 3: 12 to 16
Move 4: 9 to 14
Move 5: 10 to 15
Move 6: 11 to 16

So we build a dictionary mapping each move number to its v(b):

sample_dict = {0: 15.944312287271288, 1: 6.444167413927058, 2: 17.771995708404148, 3: 8.847647756243374, 4: 9.420835807993932, 5: 13.057118996697053, 6: 18.71362388578158}

In this case the system plays the move with the maximum v(b), i.e. move 6.

So the move taken is "Move 6: 11 to 16".
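
The selection step can be sketched in a few lines of Python. The v(b) values and the move list are copied from the example above. The `moves` lookup table and the variable names are illustrative additions, not part of the original program.

```python
# v(b) of the board reached by each legal move, keyed by move number
# (values copied from the Turn 1 example).
sample_dict = {
    0: 15.944312287271288,
    1: 6.444167413927058,
    2: 17.771995708404148,
    3: 8.847647756243374,
    4: 9.420835807993932,
    5: 13.057118996697053,
    6: 18.71362388578158,
}

# Hypothetical lookup table: move number -> (from square, to square),
# matching the moves listed for Turn 1.
moves = {
    0: (9, 13), 1: (10, 14), 2: (11, 15), 3: (12, 16),
    4: (9, 14), 5: (10, 15), 6: (11, 16),
}

# Pick the move number whose resulting board has the largest v(b).
best_move = max(sample_dict, key=sample_dict.get)
src, dst = moves[best_move]
print(f"Move {best_move}: {src} to {dst}")  # Move 6: 11 to 16
```

If two moves tie, `max` returns the first one it encounters, so the choice then depends on the dictionary's insertion order.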

That's how the system will learn how to take better moves.

For the first two turns we take random values between 0 and 1 for w0 to w6. After the second move we update w0 to w6 regularly, because from then on we have both vtrain(b) and v(b).
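
The update rule itself is not spelled out here. One common choice for a linear evaluation like this is the least-mean-squares (LMS) rule, w_i <- w_i + eta * (vtrain(b) - v(b)) * x_i, with x0 = 1 so that w0 acts as a bias. The sketch below assumes that rule. The learning rate `eta`, the function name `lms_update`, and the sample board and training value are illustrative, not taken from the project.

```python
import random


def lms_update(weights, features, v_train, eta=0.001):
    """One LMS step: w_i <- w_i + eta * (v_train - v(b)) * x_i.

    weights  -- [w0, ..., w6]
    features -- [x1, ..., x6] of board b
    Assumption: LMS is the update rule; the post does not state which is used.
    """
    xs = [1.0] + list(features)  # x0 = 1 so that w0 is updated as a bias
    v_hat = sum(w * x for w, x in zip(weights, xs))
    error = v_train - v_hat
    return [w + eta * error * x for w, x in zip(weights, xs)]


random.seed(0)  # fixed seed only to make the sketch reproducible
weights = [random.random() for _ in range(7)]  # initial w0..w6 in [0, 1)

features = [12, 12, 0, 0, 1, 0]  # hypothetical board: one white piece threatened
v_train = 20.0                   # hypothetical training value for that board

new_weights = lms_update(weights, features, v_train)
```

Because piece counts such as 12 are large compared with the weights, the learning rate has to be small. With too large an `eta` a single step overshoots and the error grows instead of shrinking.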

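One last thing: at any given turn, the v(b) values computed for the candidate moves differ from each other only through x5 and x6 (as long as no candidate move captures or crowns a piece, x1 to x4 are the same for all of them). So be careful with those two features.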

Thanks all of you for responding so quickly.

Shown below are the values of w0 to w6 after each game over a run of 10 games: (image not included)

Community