I'm trying to implement an abstract minimax algorithm with alpha beta pruning. The minimax part works great, but as soon as I add alpha beta pruning, the IA starts to behave really silly, even skipping obvious moves. I'm not sure what's going on.
This is how my recursive function looks like:
- (id<MMGameMove>)getBestMove:(id<MMGame>)game player:(MMPlayerSeed)player depth:(NSInteger)depth alpha:(NSInteger)alpha beta:(NSInteger)beta
{
id<MMGameMove> bestMove = nil;
NSArray *allMoves = [game allMoves];
for (id<MMGameMove> move in allMoves)
{
//Take the move and evaluate the game's score
id<MMGame> gameBoard = [game clone];
move.player = player;
[gameBoard saveMove:move];
self.count++;
if (depth == 0 || gameBoard.isOver)
{
move.rank = [gameBoard scoreForPlayer:self.playerId depth:depth];
}
else
{
MMPlayerSeed opponent = (player == self.playerId) ? self.opponentId : self.playerId;
move.rank = [self getBestMove:gameBoard player:opponent depth:depth-1 alpha:alpha beta:beta].rank;
}
//If the new move is better than our previous move, take it
BOOL minMove = (player == self.opponentId && move.rank <= beta);
BOOL maxMove = (player == self.playerId && move.rank >= alpha);
if (minMove || maxMove)
{
BOOL shouldPrune = NO;
if (minMove)
{
beta = move.rank;
if (alpha >= beta) {
shouldPrune = YES;
}
}
else if (maxMove)
{
alpha = move.rank;
if (alpha <= beta) {
shouldPrune = YES;
}
}
bestMove = move;
if (shouldPrune && depth < self.maxDepth) {
break;
}
}
}
return bestMove;
}
My initial call looks like this:
[self getBestMove:game player:self.playerId depth:self.maxDepth alpha:INT_MIN beta:INT_MAX];
As far as I understand, for the same game state, alpha-beta pruning should give me the very same move as minimax does without it, but with this implementation this is clearly not the case.
EDIT 1
After the suggested modification there were another error and it was that I was pruning root nodes. I edited the code to reflect the correct answer. After doing this and running both minimax with and without alpha-beta pruning I can see now that both yield the same result and also I was able to check the better performance you got out of alpha beta addition.
EDIT 2
The code posted above is not working as expected actually. I followed xXliolauXx suggestion, but I still can't get it working. I get the right values at depth = 0 or game over, but it seems they are not passed back recursively to the corresponding root move. For example, I'm able to see that my heuristic returns -3 for a child of the first root move and 0 for the rest of its childs. So I'm expecting that the first root move reports -3 instead of 0, since that's the worst case scenario the computer might find if it does that move.
Here's my new code:
- (NSInteger)alphabeta:(id<MMGame>)game player:(MMPlayerSeed)player depth:(NSInteger)depth alpha:(NSInteger)alpha beta:(NSInteger)beta
{
if (depth == 0 || game.isOver)
{
return [game scoreForPlayer:self.playerId depth:depth];
}
MMPlayerSeed opponent = (player == self.playerId) ? self.opponentId : self.playerId;
for (id<MMGameMove> move in game.allMoves)
{
id<MMGame> gameCopy = [game clone];
move.player = player;
[gameCopy saveMove:move];
self.count++;
NSInteger score = [self alphabeta:gameCopy player:opponent depth:depth-1 alpha:alpha beta:beta];
if (player == self.playerId)
{
if (depth == self.maxDepth)
{
move.rank = @(score);
[self.rootMoves addObject:move];
}
alpha = MAX(alpha, score);
if (beta < alpha)
{
break;
}
}
else
{
beta = MIN(beta, score);
if (beta < alpha)
{
break;
}
}
}
return (player == self.playerId) ? alpha : beta;
}
Note that I prune when beta < alpha while maximizing. Otherwise, it will always prune after scanning the first root move.
This is how I initiate the recursion:
[self alphabeta:game player:self.playerId depth:self.maxDepth alpha:-INFINITY beta:INFINITY];
EDIT 3
I think I got it. Rather than return alpha or beta, I return the best (or worst) score. I need to cleanup my code to make it more readable, but here's how it looks like right now:
- (NSInteger)alphabeta:(id<MMGame>)game player:(MMPlayerSeed)player depth:(NSInteger)depth alpha:(NSInteger)alpha beta:(NSInteger)beta
{
if (depth == 0 || game.isOver)
{
return [game scoreForPlayer:self.playerId depth:depth];
}
MMPlayerSeed opponent;
NSInteger bestScore;
if (player == self.playerId)
{
opponent = self.opponentId;
bestScore = -INFINITY;
}
else
{
opponent = self.playerId;
bestScore = INFINITY;
}
for (id<MMGameMove> move in game.allMoves)
{
id<MMGame> gameCopy = [game clone];
move.player = player;
[gameCopy saveMove:move];
self.count++;
NSInteger score = [self alphabeta:gameCopy player:opponent depth:depth-1 alpha:alpha beta:beta];
if (player == self.playerId)
{
bestScore = MAX(bestScore, score);
alpha = MAX(alpha, bestScore);
if (depth == self.maxDepth)
{
move.rank = @(score);
[self.rootMoves addObject:move];
}
if (beta < alpha)
{
break;
}
}
else
{
bestScore = MIN(bestScore, score);
beta = MIN(beta, bestScore);
if (beta < alpha)
{
break;
}
}
}
return bestScore;
}