Mancala AI completes the game and wins in one round instead of playing one turn

Question

I'm working on a Mancala game in which a player plays against an AI. The code is complete: the Mancala game logic is in the mancala_helpers file, the AI is a minimax tree in the MinMax file, and the last file runs the game itself. Everything runs fine except when the AI plays. If the AI starts, the game ends immediately: it moves all the rocks from its pits in a single round. If I start, I get only one move before it does the same.

I can't understand what's happening. At first I thought the problem was in the mancala helpers, with turns not switching properly so that the AI kept playing, but I ran multiple tests and that part works fine. I can't identify the issue, so any help would be appreciated. Suggestions for a better evaluation function would also be welcome. Thanks.

--------------------------Mancala helpers--------------
# TODO: implement pad(num)
# Return a string representation of num that is always two characters wide.
# If num does not already have two digits, a leading "0" is inserted in front.
# This is called "padding".  For example, pad(12) is "12", and pad(1) is "01".
# You can assume num is either one or two digits long.
def pad(num: int) -> str:
    x = str(num)
    if len(x) > 1:
        return x 
    else:
        return "0"+x
# TODO: implement pad_all(nums)
# Return a new list whose elements are padded versions of the elements in nums.
# For example, pad_all([12, 1]) should return ["12", "01"].
# Your code should create a new list, and not modify the original list.
# You can assume each element of nums is an int with one or two digits.
def pad_all(nums: list) -> list:
    x = []
    for i in nums:
        x.append(pad(i))
    return x


# TODO: implement initial_state()
# Return a (player, board) tuple representing the initial game state
# The initial player is player 0.
# board is list of ints representing the initial mancala board at the start of the game.
# The list element at index p should be the number of gems at position p.
def initial_state() -> tuple:
    return (0, [4, 4, 4, 4, 4, 4, 0, 4, 4, 4, 4, 4, 4, 0])

# TODO: implement game_over(state)
# Return True if the game is over, and False otherwise.
# The game is over once all pits are empty.
# Your code should not modify the board list.
# The built-in functions "any" and "all" may be useful:
#   https://docs.python.org/3/library/functions.html#all
def game_over(state: tuple) -> bool:
    lst = state[1]
    if (lst[0] == lst[1] == lst[2] == lst[3] == lst[4] == lst[5] == 0) or (lst[7] == lst[8] == lst[9] == lst[10] == lst[11] == lst[12] == 0):
        return True
    else:
        return False

# TODO: implement valid_actions(state)
# state is a (player, board) tuple
# Return a list of all positions on the board where the current player can pick up gems.
# A position is a valid move if it is one of the player's pits and has 1 or more gems in it.
# For example, if all of player's pits are empty, you should return [].
# The positions in the returned list should be ordered from lowest to highest.
# Your code should not modify the board list.
def valid_actions(state: tuple) -> list:
    actions = []
    lst = state[1]
    player = state[0]
    if player == 0:
        for i in range(6):
            if lst[i] > 0:
                actions.append(i)
        return actions
    else:
        for i in range(6):
            if lst[i+7] >0: actions.append(i+7)
        return actions


# TODO: implement mancala_of(player)
# Return the numeric position of the given player's mancala.
# Player 0's mancala is on the right and player 1's mancala is on the left.
# You can assume player is either 0 or 1.
def mancala_of(player: int) -> int:
    if player ==0: return 6
    elif player==1: return 13

# TODO: implement pits_of(player)
# Return a list of numeric positions corresponding to the given player's pits.
# The positions in the list should be ordered from lowest to highest.
# Player 0's pits are on the bottom and player 1's pits are on the top.
# You can assume player is either 0 or 1.
def pits_of(player: int) -> list:
    if player == 0:
        return [0, 1, 2, 3, 4, 5]
    elif player == 1:
        return [7, 8, 9, 10, 11, 12]

# TODO: implement player_who_can_do(move)
# Return the player (either 0 or 1) who is allowed to perform the given move.
# The move is allowed if it is the position of one of the player's pits.
# For example, position 2 is one of player 0's pits.
# So player_who_can_do(2) should return 0.
# You can assume that move is a valid position for one of the players.
def player_who_can_do(move: int) -> int:
    if move in [0,1,2,3,4,5] : return 0
    elif move in [7,8,9,10,11,12]: return 1

# TODO: implement opposite_from(position)
# Return the position of the pit that is opposite from the given position.
# Check the pdf instructions for the definition of "opposite".

def opposite_from(position: int) -> int:
    d_p_1 = {}
    d_p_1[0]=12
    d_p_1[1]=11
    d_p_1[2]=10
    d_p_1[3]=9
    d_p_1[4]=8
    d_p_1[5]=7

    d_p_1[7]=5
    d_p_1[8]=4
    d_p_1[9]=3
    d_p_1[10]=2
    d_p_1[11]=1
    d_p_1[12]=0 
    return d_p_1[position]

# TODO: implement play_turn(move, board)
# Return the new game state after the given move is performed on the given board.
# The return value should be a tuple (new_player, new_board).
#   new_player should be the player (0 or 1) whose turn it is after the move.
#   new_board should be a list representing the new board state after the move.
#
# Parameters:
#   board is a list representing the current state of the game board before the turn is taken.
#   move is an int representing the position where the current player picks up gems.
# You can assume that move is a valid move for the current player who is taking their turn.
# Check the pdf instructions for the detailed rules of taking a turn.
#
# It may be helpful to use several of the functions you implemented above.
# You will also need control flow such as loops and if-statements.
# Lastly, the % (modulo) operator may be useful:
#  (x % y) returns the remainder of x / y
#  from: https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex

def play_turn(move: int, board: list) -> tuple:
    player = player_who_can_do(move)
    new_board = board
    gems = new_board[move]
    new_board[move] = 0

    hasht = {}
    hasht[0] =1
    hasht[1] = 0

    if player ==0: 
         x =0
         offset = 1
         gems_counter = gems
         for i in range(gems):
            if i + move + offset == 13: offset += 1
            elif (i+move+offset) - 14 == 13: offset += 1
            
            if i + move +offset > 13:
                gem_position = (i+move+offset) - 14
            else:
                gem_position = i + move + offset
            
            new_board[gem_position] += 1
            gems_counter -= 1
            if gems_counter ==0 and gem_position==6: x = 1

            if gems_counter==0 and gem_position in pits_of(0) and new_board[gem_position] == 1 and new_board[opposite_from(gem_position)] > 0:
                gems_from_myside = new_board[gem_position]
                gems_from_opside = new_board[opposite_from(gem_position)]
                # add the captured gems to the mancala instead of overwriting it
                new_board[6] += gems_from_myside+gems_from_opside
                new_board[gem_position] = 0
                new_board[opposite_from(gem_position)] = 0
         
         return (hasht[x],new_board)


    if player ==1:
         
         x_2 = 1
         offset = 1
         gems_counter2 = gems
         for i in range(gems):
            if i + move + offset == 6: offset += 1
            elif (i+move+offset) - 14 == 6: offset += 1
            
            if i + move +offset > 13:
                gem_position = (i+move+offset) - 14
            else:
                gem_position = i + move + offset
            
            new_board[gem_position] += 1
            gems_counter2 -= 1
            if gems_counter2 == 0 and gem_position == 13:  x_2 = 0

            if gems_counter2==0 and gem_position in pits_of(1) and new_board[gem_position] == 1 and new_board[opposite_from(gem_position)] > 0:
                gems_from_myside = new_board[gem_position]
                gems_from_opside = new_board[opposite_from(gem_position)]
                
                # add the captured gems to the mancala instead of overwriting it
                new_board[13] += gems_from_myside+gems_from_opside
                new_board[gem_position] = 0
                new_board[opposite_from(gem_position)] = 0
         
         
         return (hasht[x_2],new_board)
    



# TODO: implement clear_pits(board)
# Return a new list representing the game state after clearing the pits from the board.
# When clearing pits, any gems in a player's pits get moved to that player's mancala.
# Check the pdf instructions for more detail about clearing pits.
def clear_pits(board: list) -> list:

    length = len(board)
    middle_index = length // 2
    first_half = board[:middle_index]
    second_half = board[middle_index:]
    
    for i in range(6):
        first_half[6] += first_half[i]
        first_half[i]=0
        second_half[6] += second_half[i]
        second_half[i] = 0
    return (first_half+second_half) 
        




# This one is done for you.
# Plays a turn and clears pits if needed.
def perform_action(action, state):
    player, board = state
    new_player, new_board = play_turn(action, board)
    if 0 in [len(valid_actions((0, new_board))), len(valid_actions((1, new_board)))]:
        new_board = clear_pits(new_board)
    return new_player, new_board

# TODO: implement score_in(state)
# state is a (player, board) tuple
# Return the score in the given state.
# The score is the number of gems in player 0's mancala, minus the number of gems in player 1's mancala.
def score_in(state: tuple) -> int:
    lst = state[1]
    return lst[6] - lst[13]

# TODO: implement is_tied(board)
# Return True if the game is tied in the given board state, False otherwise.
# A game is tied if both players have the same number of gems in their mancalas.
# You can assume all pits have already been cleared on the given board.
def is_tied(board: list) -> bool:
    if board[mancala_of(0)] - board[mancala_of(1)] == 0: return True
    else: return False

# TODO: implement winner_of(board)
# Return the winning player (either 0 or 1) in the given board state.
# The winner is the player with more gems in their mancala.
# You can assume it is not a tied game, and all pits have already been cleared.
def winner_of(board: list) -> int:
    if board[mancala_of(0)] > board[mancala_of(1)]: return 0 
    elif board[mancala_of(0)] < board[mancala_of(1)]: return 1

# TODO: implement string_of(board)

def string_of(board: list) -> str:
    new_board = pad_all(board)
    return '\n           {} {} {} {} {} {}\n        {}                   {}\n           {} {} {} {} {} {}\n'.format(new_board[12],new_board[11],new_board[10],new_board[9],new_board[8],new_board[7],new_board[13],new_board[6],new_board[0],new_board[1],new_board[2],new_board[3],new_board[4],new_board[5])



-----------------------MinMax AI-------------------------------------------------------------
from os import stat
import numpy as np
from mancala_helpers import *

# A simple evaluation function that simply uses the current score.
def simple_evaluate(state):
    return score_in(state)

# TODO
# Implement a better evaluation function that outperforms the simple one.
def better_evaluate(state):
     #lst = state[1] 
     #return score_in(state)/2
     return None
     
# depth-limited minimax as covered in lecture
def minimax(state, max_depth, evaluate):
    # returns chosen child state, utility

    # base cases
    if game_over(state): return None, score_in(state)
    if max_depth == 0: return None, evaluate(state)

    # recursive case
    children = [perform_action(action, state) for action in valid_actions(state)]
    results = [minimax(child, max_depth-1, evaluate) for child in children]


    _, utilities = zip(*results)
    player, board = state
    if player == 0: action = np.argmax(utilities)
    if player == 1: action = np.argmin(utilities)
    return children[action], utilities[action]

# runs a competitive game between two AIs:
# better_evaluation (as player 0) vs simple_evaluation (as player 1)
def compete(max_depth, verbose=True):
    state = initial_state()
    while not game_over(state):

        player, board = state
        if verbose: print(string_of(board))
        if verbose: print("--- %s's turn --->" % ["Better","Simple"][player])
        state, _ = minimax(state, max_depth, [better_evaluate, simple_evaluate][player])
    
    score = score_in(state)
    player, board = state
    if verbose:
        print(string_of(board))
        print("Final score: %d" % score)
    
    return score


if __name__ == "__main__":
    
    score = compete(max_depth=4, verbose=True)

----------------------------------------playing the game---------------------------
from os import stat
from mancala_helpers import *
from mancala_minimax import minimax, simple_evaluate

def get_user_action(state):
    actions = list(map(str, valid_actions(state)))
    player, board = state
    prompt = "Player %d, choose an action (%s): " % (player, ",".join(actions))
    while True:
        action = input(prompt)
        if action in actions: return int(action)
        print("Invalid action, try again.")

if __name__ == "__main__":

    max_depth = 1
    state = initial_state()
    while not game_over(state):

        player, board = state
        print(string_of(board))
        if player == 0: 
            action = get_user_action(state)
            state = perform_action(action, state)
        else:
            print("--- AI's turn --->")
            #print(string_of(board))
            print(state)
            print(max_depth)
        
            state, _ = minimax(state, max_depth, simple_evaluate)
            #print(string_of(board))
    
    player, board = state
    print(string_of(board))
    if is_tied(board):
        print("Game over, it is tied.")
    else:
        winner = winner_of(board)
        print("Game over, player %d wins." % winner)

score 0 · Answer 1 · answered Oct 23 '21 at 16:24

The entire problem was in the mancala helpers file. In play_turn(), the second line is new_board = board. This does not copy the list; it makes new_board a second name for the same list, so every change to new_board also changes board. Minimax calls perform_action() once for each valid action on the same state and relies on that state staying unchanged, so each move it was only trying out was actually played on the real board. Replacing the line with new_board = copy.deepcopy(board) (after import copy) fixed everything. copy.deepcopy() creates a completely independent copy, so changes to one do not affect the other.
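
Here is a minimal sketch of the aliasing behaviour described above, not the original helper code. The functions move_aliased and move_copied are hypothetical stand-ins for the first lines of play_turn(): they only pick up the gems from one pit. The list comprehensions mimic how minimax builds its children, with one call per candidate move on the same board.

```python
import copy

def move_aliased(move, board):
    # Buggy pattern: new_board is just another name for board.
    new_board = board
    new_board[move] = 0
    return new_board

def move_copied(move, board):
    # Fixed pattern: work on an independent copy and leave board untouched.
    new_board = copy.deepcopy(board)
    new_board[move] = 0
    return new_board

start = [4, 4, 4, 4, 4, 4, 0, 4, 4, 4, 4, 4, 4, 0]

# With aliasing, trying each of player 1's moves empties every pit
# on the one shared board.
shared = list(start)
aliased_children = [move_aliased(m, shared) for m in range(7, 13)]
print(shared[7:13])              # [0, 0, 0, 0, 0, 0]

# With copying, the original board survives and each child differs.
safe = list(start)
copied_children = [move_copied(m, safe) for m in range(7, 13)]
print(safe[7:13])                # [4, 4, 4, 4, 4, 4]
print(copied_children[0][7:13])  # [0, 4, 4, 4, 4, 4]
```

Because the board here is a flat list of ints, a shallow copy such as list(board) or board.copy() would probably be enough. copy.deepcopy() is the safer choice if the board ever contains nested lists.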
