I'm trying to solve Three Men's Morris. The details of the game don't matter, that that it's a game similar to tic tac toe, but players may be able to force a win from some positions, or be able to force the game to repeat forever by playing the same moves over and over in other positions. So I want to make a function to tell whether a player can force a win, or force a draw by repetition.
I've tried using simple negamax, which works fine but is way too slow to traverse the game tree with unlimited depth. I want to use transposition tables since the number of possible positions is very low (<6000) but that's where my problem comes from. As soon as I add in the transposition table (just a list of all fully searched positions and their values, 0, 1, or -1) the AI starts making weird moves, suddenly saying its a draw in positions where I have a forced win.
I think the problem comes from transposition table entries being saved as draws, since it seemed to work when I limited the depth and only saved forced wins, but I'm not sure how to fix the problem and allow for unlimited depth.
Here's the code in case there's an issue with my implementation:
int evaluate(ThreeMensMorris &board){
//game is won or drawn
if(board.isGameWon()) return -1; //current player lost
if(board.isRepetition()) return 0; //draw by repetition
//check if this position is already in the transposition table
//if so, return its value
uint32_t pos = board.getPosInt();
for(int i = 0; i < transIdx; i++)
if(transList[i] == pos)
return valueList[i];
//negamax
//NOTE: moves are formatted as two numbers, "from" and "to",
//where "to" is -1 to place a piece for the first time
//so this nested for loop goes over all possible moves
int bestValue = -100;
for(int i = 0; i < 9; i++){
for(int j = -1; j < 9; j++){
if(!board.makeMove(i, j)) continue; //illegal move
int value = -1 * evaluate(board, depth+1);
board.unmakeMove(i, j);
if(value > bestValue) bestValue = value;
}
}
//we have a new position complete with a value, push it to the end of the list
transList[transIdx] = pos;
valueList[transIdx] = bestValue;
transIdx++;
return bestValue;
}