Minimax algorithm in python using tic tac toe

Question

I'm trying to make a tic-tac-toe AI that plays the game optimally using the minimax algorithm. I got it to run, only to notice that it does not make optimal moves: playing it against itself always results in a win for the 'X' player (it should result in a draw). Here is my code for the algorithm:

def getBestMove(state, player):
    '''
    Minimax Algorithm
    '''
    winner_loser , done = check_current_state(state)
    if done == "Done" and winner_loser == 'O': # If AI won
        return 1
    elif done == "Done" and winner_loser == 'X': # If Human won
        return -1
    elif done == "Draw":    # Draw condition
        return 0
        
    moves = []
    empty_cells = []
    for i in range(3):
        for j in range(3):
            if state[i][j] is ' ':
                empty_cells.append(i*3 + (j+1))
    
    for empty_cell in empty_cells:
        move = {}
        move['index'] = empty_cell
        new_state = copy_game_state(state)
        play_move(new_state, player, empty_cell)
        
        if player == 'O':    # If AI
            result = getBestMove(new_state, 'X')    # make more depth tree for human
            move['score'] = result
        else:
            result = getBestMove(new_state, 'O')    # make more depth tree for AI
            move['score'] = result
        
        moves.append(move)

    # Find best move
    best_move = None
    if player == 'O':   # If AI player
        best = -infinity
        for move in moves:
            if move['score'] > best:
                best = move['score']
                best_move = move['index']
    else:
        best = infinity
        for move in moves:
            if move['score'] < best:
                best = move['score']
                best_move = move['index']
                
    return best_move

What can I do here to fix it?

I'm not a Python person, but `return best_move` - shouldn't that return the score? — 500 - Internal Server Error, Nov 02 '20 at 12:08
In e.g. `if move['score'] > best`, `move['score']` seems to be a numeric value, but with `return best_move` you seem to be returning a move (not the score), which in turn gets stored after the recursive calls. That seems wrong to me, but maybe Python is doing something implicitly here that I am unaware of. — 500 - Internal Server Error, Nov 02 '20 at 13:34
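To illustrate the point raised in the comments, here is a minimal sketch in which the recursion returns only a score and a separate wrapper picks the move. The helpers `winner`, `empty_cells`, and `play` are hypothetical stand-ins for the question's `check_current_state`, `copy_game_state`, and `play_move`, which are not shown, so treat this as an assumption about how they behave rather than the asker's actual code.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(state):
    """Return 'X' or 'O' if that player has three in a row, else None (hypothetical helper)."""
    flat = [cell for row in state for cell in row]
    for a, b, c in LINES:
        if flat[a] != ' ' and flat[a] == flat[b] == flat[c]:
            return flat[a]
    return None

def empty_cells(state):
    """Empty cells numbered 1..9, matching the question's i*3 + (j+1) scheme."""
    return [i * 3 + j + 1 for i in range(3) for j in range(3) if state[i][j] == ' ']

def play(state, player, cell):
    """Return a copy of state with player placed on cell (1..9)."""
    new_state = [row[:] for row in state]
    new_state[(cell - 1) // 3][(cell - 1) % 3] = player
    return new_state

def score(state, player):
    """Minimax value with player to move: +1 if O wins, -1 if X wins, 0 for a draw."""
    w = winner(state)
    if w == 'O':
        return 1
    if w == 'X':
        return -1
    cells = empty_cells(state)
    if not cells:
        return 0
    other = 'X' if player == 'O' else 'O'
    results = [score(play(state, player, c), other) for c in cells]
    # The recursion always hands back a score, never a cell index.
    return max(results) if player == 'O' else min(results)

def get_best_move(state, player):
    """Pick the cell whose resulting position has the best score for player."""
    other = 'X' if player == 'O' else 'O'
    pick = max if player == 'O' else min
    return pick(empty_cells(state), key=lambda c: score(play(state, player, c), other))
```

Only `get_best_move` ever returns a cell index, so scores and indices can no longer be mixed up inside the recursion.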

score 4 · Answer 1 · answered Nov 07 '20 at 17:11

I think it is easier to follow the standard minimax algorithm, which you can find, for example, on Wikipedia. I also suggest adding alpha-beta pruning to make it a bit faster, even though it is not really necessary in tic-tac-toe. Here is an example from a game I made long ago that you can use for inspiration. It is basically taken from the Wikipedia page, with some minor tweaks such as the `if beta <= alpha` check for the alpha-beta pruning:

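import copy
import math

# Example call; assumes minimax (defined below) and a board object already exist
move, evaluation = minimax(board, 8, -math.inf, math.inf, True)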

def minimax(board, depth, alpha, beta, maximizing_player):

    if depth == 0 or board.is_winner('X') or board.is_winner('O') or board.is_board_full():
        return None, evaluate(board)

    children = board.get_possible_moves(board)
    best_move = children[0]
    
    if maximizing_player:
        max_eval = -math.inf        
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.board[child[0]][child[1]].player = 'O'
            current_eval = minimax(board_copy, depth - 1, alpha, beta, False)[1]
            if current_eval > max_eval:
                max_eval = current_eval
                best_move = child
            alpha = max(alpha, current_eval)
            if beta <= alpha:
                break
        return best_move, max_eval

    else:
        min_eval = math.inf
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.board[child[0]][child[1]].player = 'X'
            current_eval = minimax(board_copy, depth - 1, alpha, beta, True)[1]
            if current_eval < min_eval:
                min_eval = current_eval
                best_move = child
            beta = min(beta, current_eval)
            if beta <= alpha:
                break
        return best_move, min_eval

def evaluate(board):
    if board.is_winner('X'):
        return -1
    if board.is_winner('O'):
        return 1
    return 0

Note that it is important to make a deep copy of the board (or to call an unmake-move function after the recursive minimax call). Otherwise you change the state of the original board and get strange behaviour.
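The unmake-move alternative can be sketched as follows. This is a simplified variant, not the answer's board class: it assumes the board is a flat list of 9 cells holding `'X'`, `'O'`, or `' '`, and the names `has_won` and `self_play` are made up for illustration.

```python
import math

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def has_won(cells, player):
    """True if player occupies a full row, column, or diagonal."""
    return any(all(cells[i] == player for i in line) for line in WIN_LINES)

def minimax(cells, alpha, beta, maximizing):
    """Return (best_move, evaluation); 'O' maximizes, 'X' minimizes."""
    if has_won(cells, 'O'):
        return None, 1
    if has_won(cells, 'X'):
        return None, -1
    moves = [i for i, c in enumerate(cells) if c == ' ']
    if not moves:
        return None, 0
    player = 'O' if maximizing else 'X'
    best_move = moves[0]
    best_eval = -math.inf if maximizing else math.inf
    for move in moves:
        cells[move] = player      # make the move in place, no copy
        _, value = minimax(cells, alpha, beta, not maximizing)
        cells[move] = ' '         # unmake it before anything else happens
        if maximizing:
            if value > best_eval:
                best_eval, best_move = value, move
            alpha = max(alpha, best_eval)
        else:
            if value < best_eval:
                best_eval, best_move = value, move
            beta = min(beta, best_eval)
        if beta <= alpha:
            break                 # alpha-beta cutoff
    return best_move, best_eval

def self_play():
    """Let the search play both sides; return the final board and its evaluation."""
    cells = [' '] * 9
    maximizing = False            # 'X' moves first
    while True:
        move, value = minimax(cells, -math.inf, math.inf, maximizing)
        if move is None:          # terminal position reached
            return cells, value
        cells[move] = 'O' if maximizing else 'X'
        maximizing = not maximizing
```

Because every move is undone right after its recursive call, the list passed to `minimax` is left unchanged when the search returns, and `self_play()` ends with an evaluation of 0, i.e. a draw.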
