I'm trying to implement a computer player in a Connect Four type game. Alpha-beta pruning seemed like the best way to achieve this, but I cannot seem to figure out what I'm doing wrong.

The following is the code I've come up with. It starts with an initial root state. For every valid move (provided no pruning occurs), the algorithm makes a deep copy of the state, updates the copy (increases depth, switches turns, adds a piece, sets a heuristic value), and adds this new state to the root's list of successors.

If the new state is not a leaf (i.e. it has not reached max depth), the algorithm recurses into it. If it is a leaf, the algorithm compares the leaf's value with the root's value and the appropriate local alpha/beta value and updates them accordingly. After all valid moves have been checked, the algorithm returns the appropriate local alpha/beta value.

At least, that is what I intended. However, every run returns a value of 0. As requested, here is the initialization code, followed by the algorithm:

import copy

class GameState:

   def __init__(self, parentState = None):

      # copy constructor
      if parentState is not None:

         self.matrix = copy.deepcopy(parentState.matrix)
         self.successor = copy.deepcopy(parentState.successor)
         self.depth = parentState.depth
         self.turn = parentState.turn
         self.alpha = parentState.alpha
         self.beta = parentState.beta
         self.connects = copy.deepcopy(parentState.connects)
         self.value = parentState.value
         self.algo_value = parentState.algo_value
         self.solution = parentState.solution

      # new instance 
      else:

         # empty board
         self.matrix = [[0 for y in xrange(6)] for x in xrange(7)]

         ## USED WHEN GROWING TREE
         self.successor = [] # empty list
         self.depth = 0 # start at root
         self.turn = 1 # game starts on user's turn

         ## USED WHEN SEARCHING FOR SOLUTION
         self.alpha = float("-inf")
         self.beta = float("+inf")

         self.connects = [0, 0, 0] # connects in state
         self.algo_value = float("-inf")
         self.value = 0 # alpha-beta value of connects
         self.solution = False # connect four

def alphabeta(root):

       if root.depth < MAX_EXPANSION_DEPTH:

          # pass down alpha/beta
          alpha = root.alpha
          beta = root.beta

          # for each possible move
          for x in range(7):

             # ALPHA-BETA PRUNING

             # if root is MAXIMIZER
             if (root.turn == 2) and (root.algo_value > beta): print "beta prune"

             # if root is MINIMIZER
             elif (root.turn == 1) and (root.algo_value < alpha): print "alpha prune"

             # CANNOT prune
             else:

                # if move legal
                if (checkMove(root, x)):

                   # CREATE NEW STATE
                   root.successor.append(GameState(root))
                   working_state = root.successor[-1]

                   # update state
                   working_state.successor = []
                   working_state.depth += 1
                   working_state.turn = (working_state.turn % 2) + 1
                   cons = dropPiece(working_state, x, working_state.turn)

                   # update state values
                   # MAXIMIZER
                   if working_state.turn == 2:
                      working_state.value = ((cons[0]*TWO_VAL)+(cons[1]*THREE_VAL)+(cons[2]*FOUR_VAL)) + root.value
                      working_state.algo_value = float("-inf")
                   # MINIMIZER
                   else:
                      working_state.value = ((-1)*((cons[0]*TWO_VAL)+(cons[1]*THREE_VAL)+(cons[2]*FOUR_VAL))) + root.value
                      working_state.algo_value = float("inf")

                   # if NOT a leaf node
                   if (working_state.depth < MAX_EXPANSION_DEPTH):

                      # update alpha/beta values
                      working_state.alpha = alpha
                      working_state.beta = beta

                      ret = alphabeta(working_state)

                      # if MAXIMIZER
                      if (root.turn == 2):
                         if (ret > root.algo_value): root.algo_value = ret
                         if (ret > alpha): alpha = ret
                      # if MINIMIZER
                      else:
                         if (ret < root.algo_value): root.algo_value = ret
                         if (ret < beta): beta = ret

                   # if leaf, return value
                   else:
                      if root.turn == 2:
                         if (working_state.value > root.algo_value): root.algo_value = working_state.value
                         if working_state.value > alpha: alpha = working_state.value
                      else:
                         if (working_state.value < root.algo_value): root.algo_value = working_state.value
                         if working_state.value < beta: beta = working_state.value

          if root.turn == 2: return alpha
          else: return beta
fhgshfdg
  • What is it that you are facing a problem with? –  Nov 20 '14 at 05:10
  • The return value is incorrect. It should return the maximum or minimum value depending on the root node, but it does not. – fhgshfdg Nov 20 '14 at 05:48
  • When you check if `working_state` is a leaf node, sometimes you "return" a value by modifying `root.algo_value`, but later you also `return` either `alpha` or `beta`. That is *very* hard to follow. (PS: This function is huge and deeply indented. That provides lots of room for your bug to hide, and makes it hard to spot exactly where indentations end. Consider breaking this up into smaller functions that are individually easier to test.) – Kevin J. Chase Nov 20 '14 at 06:24
  • Can you include the initialization code for the global root node? – Antti Huima Nov 20 '14 at 16:51
  • Edited to include the node (a.k.a. GameState) class – fhgshfdg Nov 20 '14 at 18:28
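
As a small sketch of the refactoring suggested in the comments, the heuristic update could be pulled out into its own function so it can be tested separately from the search. The name `score_move`, its parameters, and the weight values below are assumptions for illustration; the question does not show the real values of `TWO_VAL`, `THREE_VAL`, and `FOUR_VAL`. The sketch is written for Python 3.

```python
# Hypothetical weights: the real constants are not shown in the question.
TWO_VAL, THREE_VAL, FOUR_VAL = 1, 10, 1000

def score_move(cons, is_maximizer, parent_value=0):
    """Return the running heuristic after a move.

    `cons` is [twos, threes, fours], the connects created by the move,
    mirroring what dropPiece appears to return in the question.
    """
    gain = cons[0] * TWO_VAL + cons[1] * THREE_VAL + cons[2] * FOUR_VAL
    # The maximizer's connects add to the score, the minimizer's subtract.
    if is_maximizer:
        return parent_value + gain
    return parent_value - gain
```

With a helper like this, the two long expressions in `alphabeta` would reduce to a single call each, which makes the sign convention easier to check.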

1 Answer

Solved the issue. In the algorithm above, I checked for pruning only after the loop had already moved on to the next successor (whose default algo_value is the respective max or min value).

Instead, the algorithm should check the first node in each list of successors, update its algo_value, and THEN check for pruning of the rest of the nodes in the list of successors.
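
To illustrate that ordering, here is a minimal sketch of alpha-beta on a toy tree rather than on the Connect Four code from the question. The nested-list tree, the `maximizing` flag, and returning the value instead of storing it on the node are all assumptions made to keep the example short. It is written for Python 3.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Return the minimax value of `node`, a nested list whose leaves are numbers."""
    # Leaf: a plain number stands in for the heuristic value of a position.
    if not isinstance(node, list):
        return node

    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing)
        # First fold this child's value into the running best and the bound...
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        # ...and only THEN decide whether the remaining siblings can be skipped.
        if alpha >= beta:
            break
    return best

tree = [[3, 5], [6, 9], [1, 2]]
print(alphabeta(tree))  # 6
```

In the third subtree the leaf `2` is never visited: once the minimizer sees `1`, its beta drops below the root's alpha of 6, so the rest of that list of successors is pruned.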

fhgshfdg