Could someone be so kind as to help me understand how to use the alpha-beta pruning algorithm? I'm making a game similar to Connect Four. The only differences are that there is no diagonal win, and a player can mark any square at any given time (unless it is already occupied, of course). I think I understand how to code the algorithm, I just think I am using it wrong. What I have been doing is having a for loop that looks something like this:

for(i=0; i<size; i++)
    for(j=0; j<size; j++)
        val = alphabeta();
            if(val > max)
                max = val;
                move = set(i,j);
setBoard(move); //sets the board to the move chosen from the alphabeta() values

The problem I am having is that the first run of alphabeta returns the max val, so none of the later values are greater, and the move is always placed at board[0][0]. Does anyone know what I am doing wrong?

public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{

    Placement p = null;
    if(depth==0 || board.isWinner())
    {
        return evaluate(place, maxPlayer);
    }
    if(maxPlayer)
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j, opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = opponent;
                    p = new Placement(i, j);
                    alpha = Math.max(alpha, alphabeta(p, depth-1, alpha, beta, false));
                    board.board[i][j] = 0;
                }

                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return alpha;
    }
    else
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j,opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = player;
                    p = new Placement(i, j);
                    beta = Math.min(beta, alphabeta(p, depth-1, alpha, beta, true));
                    System.out.println(board);
                    board.board[i][j] = 0;
                }
                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return beta;
    }
}

This is the function that makes the move

public void makeMove()
{
    int max = -1;
    Placement p = null;
    int val = -1;
    for(int i=0; i<size; i++)
        for(int j=0; j<size; j++)
        {   
            if(board.validMove(i, j))
            {
                if(board.canGetFour(i, j, opponent)||(board.canGetFour(i,j,player)&&board.canGetFour(i,j,opponent)))
                {
                    board.board[i][j] = player;
                    val = alphabeta(new Placement(i,j), 5, -5000, 5000, true);
                    board.board[i][j] = 0;
                    if(val > max)
                    {
                        max = val;
                        p = new Placement(i, j);
                    }
                }
            }
        }
    board.board[p.row][p.col] = player;
    board.moves++;
}

So here's my updated code, which is still not working:

public Placement alphabeta(Placement p)
{
    int v = max(p,6,-500000, 500000);
    return successors(v);
}
public int max(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0 || board.isWinner())
    {
        return evaluateMax(p,player);
    }
    int v = -500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = player;
        v = Math.max(v, min(place, depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v>= beta)
            return v;
        alpha = Math.max(alpha, v);
    }
    return v;
}
public int min(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0||board.isWinner())
    {
        return evaluateMax(p,opponent);
    }
    int v = 500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = opponent;
        v = Math.min(v, max(place,depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v<= alpha)
            return v;
        beta = Math.min(alpha, v);
    }
    return v;
}
public void makeMove()
{
    Placement p = null;
    for(int i=0; i<successors.size(); i++)
    {
        Placement temp = successors.get(i);
        //board.board[temp.row][temp.col] = player;
        p = alphabeta(temp);
        //board.board[temp.row][temp.col] = 0;
    }
    System.out.println("My move is "+p.row + p.col);
    board.board[p.row][p.col] = player;
    successors.remove(p);
}

I changed the algorithm slightly just so I can clearly see what's going on with min and max. However, it still does not play correctly.

Dave Smith
  • What are you using alpha-beta for in this context? – Zack Newsham Dec 01 '14 at 01:09
  • I am trying to make a 4-in-a-row game. I use the alpha-beta pruning algorithm from the pseudocode on Wikipedia. Assuming the alphabeta algorithm is correct, how do you actually use it to make a move? – Dave Smith Dec 01 '14 at 01:15
  • You're going to need to post your actual code if you want help finding the bug. Are you trying to make a game where the computer moves? – Zack Newsham Dec 01 '14 at 01:21
  • Thanks for your help Zack. The Placement class is simply a pair of integers holding the row and col on the board. The opponent and player variables are just used to mark the board with either an X or an O – Dave Smith Dec 01 '14 at 01:32
  • What does `canGetFour` do? At a glance it would appear that your function will always return `alpha`, because I doubt that the first move of any game will satisfy `canGetFour` (assuming that function checks whether you can get four in a row/column with the current move?) – Zack Newsham Dec 01 '14 at 01:37
  • You are exactly right! canGetFour returns a boolean stating whether placing a mark in that spot, by that player, could ever possibly lead to four in a row. I hope that makes sense – Dave Smith Dec 01 '14 at 01:41
  • Ok, so the first thing you'll need to do is to fix that. I'm still fuzzy on the purpose of the alphabeta function here. Is it looking for the best move that can be made at any given time? Just a note about code repetition: it looks like the two halves of your alphabeta function are identical, except for using true/false in the recursive call and player/opponent. In which case, remove the if/else and use a conditional expression (e.g. `maxPlayer ? opponent : player`), which makes your code much more readable – Zack Newsham Dec 01 '14 at 01:44
  • I agree, however there is one more difference: the alpha and beta variables are set using either min or max. What exactly do you mean when you say fix the canGetFour? Is it not needed? As for the alphabeta method, it is supposed to look a certain number of steps ahead (depth) and determine which would be the best move – Dave Smith Dec 01 '14 at 01:53
  • When I say fix the canGetFour, what I mean is that's not the only reason you'd make a move, right? In fact, you will always make a move. You're trying to determine the best move to make. So, canIGetFour is your top priority. canOpponentGetFour is second. Then perhaps canIGetThree... etc. until eventually none of those are true, right? – Zack Newsham Dec 01 '14 at 02:19
  • I can't post a comment this long, I'll submit some pseudo code as an answer – Zack Newsham Dec 01 '14 at 02:20
  • Oh I see, my intention for canIGetFour was to not even bother trying if I could never win in that space or if the opponent could never win in that space. Perhaps it is not needed. This algorithm is only playing offense, and not very well. I just want to know if the way I am using the algorithm is correct, I mean how I am using it in the makeMove method – Dave Smith Dec 01 '14 at 02:23
  • Ok, I had to put a lot more work into that than I thought. You've got a lot of work I'm afraid :) – Zack Newsham Dec 01 '14 at 02:56
  • Thank you for that, I am trying it out now. I think my issue is the evaluation function. Should it score each successor or the entire board? I was adding a value for each move, but it doesn't seem to make the best move. I give it 4 if there are 2 in a row, 32 if there are 3 in a row, and 127 if there are 4 in a row. I give the 3 in a row 64 if it has blank spaces on each side. Am I on the right track? – Dave Smith Dec 01 '14 at 17:24
  • You are on the right track. It may be best to start off by forcing maxDepth = 1; that will make the algo choose the best move looking only one move ahead, rather than trying to find the best move looking all the way to the end – Zack Newsham Dec 01 '14 at 17:50

1 Answer

Ok, took some time but I think I have it.

In your evaluate function, you should be returning how good the state is for the actual player. If the placement is a canGetFour for the "otherPlayer", that is a bad state (the worst state), so you return a small number. However, if the placement is a canGetFour for the "actualPlayer", you return a large number (it's a good state).
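As a rough illustration of "large number for the actual player, small number for the other player", here is a minimal sketch that scores the whole board as (my runs) minus (opponent's runs). The weights 4 / 32 / 127 are the ones mentioned in the comments on the question (the 64 bonus for an open three is left out); the `int[][]` board with 0 for an empty square, the square-board assumption, and the names `EvalSketch`, `runScore` and `evaluate` are all made up for this example and are not part of the original code.

```java
final class EvalSketch {
    // Hypothetical weights, indexed by run length (runs of 0 or 1 score nothing).
    static final int[] WEIGHT = {0, 0, 4, 32, 127};

    // Sum the weights of every maximal horizontal and vertical run owned by `who`.
    // Assumes a square board; diagonals are ignored because they cannot win.
    static int runScore(int[][] b, int who) {
        int n = b.length, total = 0;
        for (int line = 0; line < n; line++) {
            int rowRun = 0, colRun = 0;
            // k == n is one step past the edge and flushes any run still open.
            for (int k = 0; k <= n; k++) {
                if (k < n && b[line][k] == who) rowRun++;
                else { total += WEIGHT[Math.min(rowRun, 4)]; rowRun = 0; }
                if (k < n && b[k][line] == who) colRun++;
                else { total += WEIGHT[Math.min(colRun, 4)]; colRun = 0; }
            }
        }
        return total;
    }

    // Positive means the position favours `me`, negative means it favours `other`.
    static int evaluate(int[][] b, int me, int other) {
        return runScore(b, me) - runScore(b, other);
    }
}
```

Scoring the entire board this way, rather than only the last placement, means the same function can be called at any leaf of the search regardless of which square was filled last.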

Then in your makeMove, you are just checking which candidate move leads to the best state. Note that scanning a 2D array is just about the least efficient way of enumerating the "child nodes". It would make a lot more sense to have a placement.getPossibleMoves() which returns an array of all the empty squares (taking both real and temporary moves into account), and iterate over that. Otherwise every node in the search rescans the whole board, occupied squares included, on top of a search tree that already grows exponentially with depth.
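A minimal sketch of the getPossibleMoves() idea, assuming the board is an `int[][]` where 0 means empty. The class name `MoveListSketch`, the method name `possibleMoves`, and the use of `int[]{row, col}` pairs in place of the Placement class are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class MoveListSketch {
    // Collect every empty square as a {row, col} pair, in row-major order.
    static List<int[]> possibleMoves(int[][] b) {
        List<int[]> moves = new ArrayList<>();
        for (int r = 0; r < b.length; r++) {
            for (int c = 0; c < b[r].length; c++) {
                if (b[r][c] == 0) {
                    moves.add(new int[]{r, c});
                }
            }
        }
        return moves;
    }
}
```

The search can then loop over `possibleMoves(board)` at each node. Keeping the list sorted so that promising moves come first also tends to make alpha-beta cut off earlier.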

private Placement bestNext;

// alpha and beta are parameters rather than fields, so each level of the
// recursion works on its own copy of the search window.
public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{
    // maxDepth can be at most the number of unassigned squares on the actual board
    if(depth == maxDepth || board.isWinner()){
       return evaluate(place, maxPlayer);
    }
    for(int i=0; i<board.size; i++)
    {
        for(int j=0; j<board.size; j++)
        {
            if(board.validMove(i,j)){
                Placement p = new Placement(i, j);
                // make the move temporarily for whoever's turn it is
                board.board[i][j] = maxPlayer ? player : opponent;
                int tmp = alphabeta(p, depth + 1, alpha, beta, !maxPlayer);
                board.board[i][j] = 0; // undo the temporary move
                if(maxPlayer){
                  alpha = Math.max(alpha, tmp);
                }
                else{
                  beta = Math.min(beta, tmp);
                }
            }

            if(beta<=alpha)
                break;
        }
        if(beta<=alpha)
            break;
    }
    return maxPlayer ? alpha : beta;
}
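
On the question of how to actually use the search to pick a move: the top-level loop has to try each candidate move, call the search for the opponent's reply (so the first recursive call is a minimizing one), and remember the move with the highest returned value. Below is a self-contained sketch of that idea, not a drop-in replacement for the code above. The class `AlphaBetaSketch`, the constants `ME`, `OPP`, `WIN`, `INF`, the `int[][]` square board with 0 for empty, and the "return 0 at the depth limit" placeholder heuristic are all assumptions made for the example.

```java
final class AlphaBetaSketch {
    static final int ME = 1, OPP = 2, WIN = 1000, INF = 1_000_000;

    // True if `who` has four in a row horizontally or vertically (no diagonals).
    static boolean wins(int[][] b, int who) {
        int n = b.length;
        for (int line = 0; line < n; line++) {
            int rowRun = 0, colRun = 0;
            for (int k = 0; k < n; k++) {
                rowRun = (b[line][k] == who) ? rowRun + 1 : 0;
                colRun = (b[k][line] == who) ? colRun + 1 : 0;
                if (rowRun == 4 || colRun == 4) return true;
            }
        }
        return false;
    }

    // Value of the position from ME's point of view; `meToMove` says whose turn it is.
    static int alphabeta(int[][] b, int depth, int alpha, int beta, boolean meToMove) {
        if (wins(b, ME)) return WIN + depth;    // quicker wins score higher
        if (wins(b, OPP)) return -WIN - depth;  // quicker losses score lower
        if (depth == 0) return 0;               // placeholder heuristic at the depth limit
        int best = meToMove ? -INF : INF;
        boolean moved = false;
        for (int r = 0; r < b.length; r++) {
            for (int c = 0; c < b.length; c++) {
                if (b[r][c] != 0) continue;
                moved = true;
                b[r][c] = meToMove ? ME : OPP;  // try the move
                int v = alphabeta(b, depth - 1, alpha, beta, !meToMove);
                b[r][c] = 0;                    // undo it
                if (meToMove) { best = Math.max(best, v); alpha = Math.max(alpha, best); }
                else { best = Math.min(best, v); beta = Math.min(beta, best); }
                if (beta <= alpha) return best; // prune the remaining siblings
            }
        }
        return moved ? best : 0;                // full board with no winner: draw
    }

    // Root loop: returns {row, col} of the best move for ME, or null if the board is full.
    static int[] bestMove(int[][] b, int depth) {
        int[] best = null;
        int alpha = -INF;
        for (int r = 0; r < b.length; r++) {
            for (int c = 0; c < b.length; c++) {
                if (b[r][c] != 0) continue;
                b[r][c] = ME;
                int v = alphabeta(b, depth - 1, alpha, INF, false); // opponent replies next
                b[r][c] = 0;
                if (best == null || v > alpha) { alpha = v; best = new int[]{r, c}; }
            }
        }
        return best;
    }
}
```

The key difference from a plain `val > max` loop is that each root move is searched with the opponent to move, and the best value so far is passed down as alpha so later root moves can be cut off early. A call such as `AlphaBetaSketch.bestMove(board, 2)` looks one move ahead for each side.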
Zack Newsham
  • I added my new code based on your suggestions. My alpha-beta algorithm is slightly different because I want to get a better idea of what's going on, but I still have no luck – Dave Smith Dec 02 '14 at 02:29