5

I recently enrolled to CS50 AI python course and one the projects to do is to implement a minimax algorithm for a tictactoe game. I looked for help and searched stackoverflow but I didn't find an answer that could help me. The graphic part of it is already implemented, and all you need to do is to program the given functions of a template, and I believe I got them right with the only exception of the algorithm part, the functions are the following:

import math
import copy

X = "X"
O = "O"
EMPTY = None


def initial_state():
    """
    Returns starting state of the board.
    """
    return [[EMPTY, EMPTY, EMPTY],
            [EMPTY, EMPTY, EMPTY],
            [EMPTY, EMPTY, EMPTY]]


def player(board):
    """
    Returns player who has the next turn on a board.
    """
    if board == initial_state():
        return X

    xcounter = 0
    ocounter = 0
    for row in board:
        xcounter += row.count(X)
        ocounter += row.count(O)

    if xcounter == ocounter:
        return X
    else:
        return O


def actions(board):
    """
    Returns set of all possible actions (i, j) available on the board.
    """
    possible_moves = []
    for i in range(3):
        for j in range(3):
            if board[i][j] == EMPTY:
                possible_moves.append([i, j])
    return possible_moves


def result(board, action):
    """
    Returns the board that results from making move (i, j) on the board.
    """
    boardcopy = copy.deepcopy(board)
    try:
        if boardcopy[action[0]][action[1]] != EMPTY:
            raise IndexError
        else:
            boardcopy[action[0]][action[1]] = player(boardcopy)
            return boardcopy
    except IndexError:
        print('Spot already occupied')


def winner(board):
    """
    Returns the winner of the game, if there is one.
    """
    columns = []
    # Checks rows
    for row in board:
        xcounter = row.count(X)
        ocounter = row.count(O)
        if xcounter == 3:
            return X
        if ocounter == 3:
            return O

    # Checks columns
    for j in range(len(board)):
        column = [row[j] for row in board]
        columns.append(column)

    for j in columns:
        xcounter = j.count(X)
        ocounter = j.count(O)
        if xcounter == 3:
            return X
        if ocounter == 3:
            return O

    # Checks diagonals
    if board[0][0] == O and board[1][1] == O and board[2][2] == O:
        return O
    if board[0][0] == X and board[1][1] == X and board[2][2] == X:
        return X
    if board[0][2] == O and board[1][1] == O and board[2][0] == O:
        return O
    if board[0][2] == X and board[1][1] == X and board[2][0] == X:
        return X

    # No winner/tie
    return None


def terminal(board):
    """
    Returns True if game is over, False otherwise.
    """
    # Checks if board is full or if there is a winner
    empty_counter = 0
    for row in board:
        empty_counter += row.count(EMPTY)
    if empty_counter == 0:
        return True
    elif winner(board) is not None:
        return True
    else:
        return False


def utility(board):
    """
    Returns 1 if X has won the game, -1 if O has won, 0 otherwise.
    """
    if winner(board) == X:
        return 1
    elif winner(board) == O:
        return -1
    else:
        return 0


def minimax(board):
    current_player = player(board)

    if current_player == X:
        v = -math.inf
        for action in actions(board):
            k = min_value(result(board, action))    #FIXED
            if k > v:
                v = k
                best_move = action
    else:
        v = math.inf
        for action in actions(board):
            k = max_value(result(board, action))    #FIXED
            if k < v:
                v = k
                best_move = action
    return best_move

def max_value(board):
    if terminal(board):
        return utility(board)
    v = -math.inf
    for action in actions(board):
        v = max(v, min_value(result(board, action)))
    return v    #FIXED

def min_value(board):
    if terminal(board):
        return utility(board)
    v = math.inf
    for action in actions(board):
        v = min(v, max_value(result(board, action)))
    return v    #FIXED

The last part is where the minimax(board) function is located, it's supposed to take the current state of the board and calculate the best possible move depending on whether the AI is player 'X' or 'O' (it can be any of the two), the 'X' player tries to maximize the score and the 'O' should minimize it utilizing the utility(board) function that returns a 1 for X winning, -1 for 'O' winning or 0 for ties. So far the AI moves are not optimal and I can easily win against it when I shouldn't, as in the best scenario all I should get is a tie because the AI is supposed to calculate every possible move at that point. But I don't know whats wrong...

Albert
  • 63
  • 1
  • 4
  • 1
    Why are you editing your code once you get the answer? It defeats the purpose of the Q and A if you just update the code with the solution. – Krish Jun 16 '20 at 21:16

3 Answers3

1

First a word on debugging: If you were to print the computation done in the recursive calls you could trace the execution of the problem and quickly find the answer.

But, your issue seems to be at the top of the tree. In your minimax call, if the current player is X you call the max_value on each of the children of the state, and then take the max of that results. However, that applies the max function twice at the top of the tree. The next player in the game is O, so you should be calling the min_value function for the next player.

So, in the minimax call you should be calling min_value if the current_player is X and max_value if the current_player is O.

Nathan S.
  • 5,244
  • 3
  • 45
  • 55
  • Your observation was right! it was as if the computer assumed the player had taken 2 turns in a row... thats fixed now. I then followed your debugging advice and found out that the Max/Min functions weren't looping through all the actions because they reached the return v line at the very first loop/action, so I had an indentation problem as the return v line had to be outside the for loop to allow all the iterations. Thank you very much – Albert Apr 07 '20 at 07:03
0

@harsh-kothari, the cs50 project page says

Importantly, the original board should be left unmodified: since Minimax will ultimately require considering many different board states during its computation. This means that simply updating a cell in board itself is not a correct implementation of the result function. You’ll likely want to make a deep copy of the board first before making any changes.

To avoid sub-list from being modified, we use deepcopy instead of copy

Read more about copy() and deepcopy() here

chaha0s
  • 118
  • 7
0

change the actions(board) code to this

possibleActions = set()

for i in range(0, len(board)):
    for j in range(0, len(board[0])):
        if board[i][j] == EMPTY:
            possibleActions.add((i, j))

return possibleActions