What is wrong with my minimax algorithm for tictactoe

Question

I'm building a tic tac toe game for a fun learning experience. I have constructed a minimax algorithm to return the optimal move for the computer, but somehow I am going wrong and get wierd output like this

TIC TAC TOE V1.0
---
---
---
Enter row, column of your move
1,1
---
-X-
---
...
0, 0: -1038
0, 1: -1470
0, 2: -1038
1, 0: -1470
1, 2: -1470
2, 0: -1038
2, 1: -1470
2, 2: -1038
O--
-X-
---
Enter row, column of your move
1,2
O--
-XX
---
...
0, 1: -15
0, 2: -9
1, 0: -10
2, 0: -1
2, 1: -29
2, 2: -41
O--
-XX
O--
Enter row, column of your move
1,0
O--
XXX
O--
WINNER: PLAYER

You can see that the computer chose the lower left corner rather than cutting off the player. My code attempts to recursively flip flop between turns through all possible gamestates, summing up the score for each win or loss the turn may lead to, then returns the move with the maximum score. The printout is the score of each turn before it is made (you can see it chooses the highest), so why am I not cutting off the player? How can I fix this? Here is my code.

int compMoveScoreRecursive(state_t **board, int dimension, int row, int col, state_t turn) {
    board[row][col] = turn;
    state_t winner = checkWinner(board, dimension);
    if (winner == COMPUTER) {
        return 1;
    } else if (winner == PLAYER) {
        return -1;
    } else {
        int score = 0;
        state_t nextTurn = turn == COMPUTER ? PLAYER : COMPUTER;
        for (int i = 0; i < dimension; i++) {
            for (int j = 0; j < dimension; j++) {
                if (board[i][j] == NIL) {
                    state_t **boardCopy = copyBoard(board, dimension);
                    score += compMoveScoreRecursive(boardCopy, dimension, i, j, nextTurn);
                    destroyBoard(boardCopy, dimension);
                }
            }
        }
        return score;
    }
}

move_t optimalCompMove(state_t **board, int dimension) {
    move_t optMove;
    int optScore = INT_MIN;
    for (int row = 0; row < dimension; row++) {
        for (int col = 0; col < dimension; col++) {
            if (board[row][col] == NIL) {
                state_t **boardCopy = copyBoard(board, dimension);
                int score = compMoveScoreRecursive(boardCopy, dimension, row, col, COMPUTER);
                printf("%d, %d: %d\n", row, col, score);
                if (score > optScore) {
                    optMove.row = row;
                    optMove.col = col;
                    optScore = score;
                }
                destroyBoard(boardCopy, dimension);
            }
        }
    }
    return optMove;
}

Print the prospect boards during each recursion. The results may surprise you. — WhozCraig, Jul 24 '15 at 18:40

score 4 · Accepted Answer · answered Jul 24 '15 at 18:57

The concept of the minmax algorithm is to « minimize the maximum loss » (Wikipedia), so the first thing wrong with your algorithm is your sum.

For any state S of the game, and for any move M avaialble for the current player (let's say player 1 P1), the value of minmax (S + M, P2) is the maximum possible output for P2 if P1 plays M. So if P1 wants to maximize his chance to win, he should reduce as much as possible the maximum output for P2, i.e. he must find the minimum of the outputs.

In tictactoe minmax, it is possible to test the whole game (at most 9 moves), meaning the you always now if PX win (1), lose (-1) or make a draw (0). So minmax (state, PX) will return only one of these three value.

In many game, you cannot play the whole game (draughts for example), so the returned value is an indication of the state, for instance -oo if you lose, +oo if you win, otherwize the difference between your number of draughts and your opponent.

Does this mean that I may at any time have several moves to choose from with the same minmax value? i.e. the possible moves yield +1, +1, -1, 0, +1. How would I choose between these? or does it not matter? — shane, Jul 24 '15 at 19:05
Yes you will find multiple move with the same value, the choice does not matter for a minmax algorithm. — Holt, Jul 24 '15 at 19:07

score 2 · Answer 2 · answered Jul 24 '15 at 18:42

2

To my understanding, in the implementation of compMoveScoreRecursive, the recursively calculated score is added via

score += compMoveScoreRecursive(boardCopy, dimension, i, j, nextTurn);

instead of maximizing or minimizing the value. However, the value to be returned shoud be maximized on minimized, depending on the argument turn, which is also the reason why the approach is called MinMax.

answered Jul 24 '15 at 18:42

Codor

17,447
9
29
56

Your understanding is right, in a state `state`, for any `move` available to `p1`, `minmax (state + move, p2)` should be the **maximum** possible output for `p2` if `p1` plays `move`, so in order to maximize his chance to win, `p1` must play the `move` that minimize the maximum output of `p2`. – Holt Jul 24 '15 at 18:47
so rather than adding the scores of each possible move, I need to return simply a +1 or -1 depending on whether the opponent can reach a winning state from that point of play? – shane Jul 24 '15 at 18:52
Exactly; the value of a move should be in {0,-1,1} expressing which player wins the game (or 0 for a draw game) under the assumption that the game is played perfectly, which means that each subsequent move is also optimal. – Codor Jul 24 '15 at 19:03
How would I choose between moves with the same score or does it not matter – shane Jul 24 '15 at 19:06
It does mit matter, there may be more than one optimal move. – Codor Jul 24 '15 at 19:17

score 0 · Answer 3 · answered Jul 24 '15 at 18:40

It seems like the concept behind your algorithm is flawed. Based on the way you described it, you're considering every single line of play, as opposed to assuming that the opponent is going to make the correct move. Because of this, the fact that the opponent can win with the next move has very little weight, because you also consider all of the options that the other 4 moves offer (despite the fact that those will obviously never be made). You'll have to refine your algorithm to properly min-max as opposed to doing a search of the entire set of board states

What is wrong with my minimax algorithm for tictactoe

3 Answers3