Tic-Tac-Toe minimax algorithm doesn't work with 4x4 board

Question

I've been working on this project for the past three weeks. I got the minimax function working early on for a 3x3 board, but problems started when I tried it on a 4x4 board, namely Java heap space errors. Since then, with the help of alpha-beta pruning, I've brought the number of minimax calls made within the minimax function down from approx. 59000 to 16000 to 11000 and finally to 8000 calls (assuming an initial minimax call on a board with one slot already filled). The problem now is that the method just keeps running for 4x4 games. It keeps calling itself non-stop: no errors, no result, nothing. As I see it, my function should in theory work for arbitrary board sizes; the only problem was memory. Since I've greatly reduced the memory greed of my function, I expected it to work. It does for the 3x3, but not for the 4x4.

A brief explanation of what the function does: it returns an array of size 2 containing the most favorable next move from among all possible next moves, along with the score expected from that move. The scoring system is simple: +10 for an O win, -10 for an X win and 0 for a draw. The function is of course recursive. It contains certain shortcuts that reduce the number of required calls to itself. For example, if it's X's turn and the returned score is -10 (the best possible score for X), it exits the loop, i.e. stops examining other potential moves from this state.

Here's the code for class State:

private String [] state;    //Actual content of the board
private String turn;        //Whose turn it is
private static int n;       //Size of the board

public State(int n) {
    state = new String[n*n];
    for(int i = 0; i < state.length; i++) {
        state[i] = "-";
    }
    State.n = n;
}


public int[] newminimax47(int z) {
    int bestScore = (turn == "O") ? +9 : -9;    //X is minimizer, O is maximizer
    int bestPos = -1;
    int currentScore;
    int lastAdded = z;
    if(isGameOver() != "Not Gameover") {
        bestScore= score();
    }
    else {
        int i = 0;
        for(int x:getAvailableMoves()) {
            if(turn == "X") {   //X is minimizer
                setX(x);
                currentScore = newminimax47(x)[0];
                if(i == 0) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == -10)
                        break;
                }
                else if(currentScore < bestScore) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == -10)
                        break;
                }
            }
            else if(turn == "O") {  //O is maximizer
                setO(x);
                currentScore = newminimax47(x)[0];
                if(i == 0) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == 10)
                        break;
                }

                else if(currentScore > bestScore) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == 10)
                        break;
                }
            }
            i++;
        }
    }
    revert(lastAdded);
    return new int [] {bestScore, bestPos};
}

Complementary functions used by newminimax47():

isGameOver():

public String isGameOver() {
    if(n == 3) {
        //Rows 1 to 3
        if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[3] != "-") && (state[3] == state[4]) && (state[4] == state[5]))
            return (state[3] == "X") ? "X Won" : "O Won";
        else if((state[6] != "-") && (state[6] == state[7]) && (state[7] == state[8]))
            return (state[6] == "X") ? "X Won" : "O Won";

        //Columns 1 to 3
        else if((state[0] != "-")&&(state[0] == state[3]) && (state[3] == state[6]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[1] != "-") && (state[1] == state[4]) && (state[4] == state[7]))
            return (state[1] == "X") ? "X Won" : "O Won";
        else if((state[2] != "-") && (state[2] == state[5]) && (state[5] == state[8]))
            return (state[2] == "X") ? "X Won" : "O Won";

        //Diagonals
        else if((state[0] != "-") && (state[0]==state[4]) && (state[4] == state[8]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[6] != "-") && (state[6] == state[4]) && (state[4] == state[2]))
            return (state[6] == "X") ? "X Won" : "O Won";

        //Checking if draw
        else if((state[0] != "-") && (state[1]!="-") && (state[2] != "-") && (state[3]!="-") &&
                (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") &&
                (state[8] != "-"))
            return "Draw";
        else
            return "Not Gameover";
    }
    else {
        //Rows 1 to 4
        if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2]) && (state[2] == state[3]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[4] != "-") && (state[4] == state[5]) && (state[5]==state[6]) && (state[6] == state[7]))
            return (state[4] == "X") ? "X Won" : "O Won";
        else if((state[8] != "-") && (state[8] == state[9]) && (state[9]==state[10]) && (state[10] == state[11]))
            return (state[8] == "X") ? "X Won" : "O Won";
        else if((state[12] != "-") && (state[12] == state[13]) &&(state[13] == state[14]) && (state[14] == state[15]))
            return (state[12] == "X") ? "X Won" : "O Won";

        //Columns 1 to 4
        else if((state[0] != "-") && (state[0] == state[4]) && (state[4] == state[8]) && (state[8] == state[12]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[1] != "-") && (state[1] == state[5]) && (state[5] == state[9]) && (state[9] == state[13]))
            return (state[1] == "X") ? "X Won" : "O Won";
        else if((state[2] != "-") && (state[2] == state[6]) && (state[6] == state[10]) && (state[10] == state[14]))
            return (state[2] == "X") ? "X Won" : "O Won";
        else if((state[3] != "-") && (state[3] == state[7]) && (state[7] == state[11]) && (state[11] == state[15]))
            return (state[3] == "X") ? "X Won" : "O Won";

        //Diagonals
        else if((state[0] != "-") && (state[0] == state[5]) && (state[5] == state[10]) && (state[10] == state[15]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[12] != "-") && (state[12] == state[9]) &&  (state[9] == state[6]) && (state[6] == state[3]))
            return (state[12] == "X") ? "X Won" : "O Won";

        //Checking if draw
        else if((state[0] != "-") && (state[1] != "-") && (state[2] != "-") && (state[3]!="-") &&
                (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") &&
                (state[8] != "-") && (state[9] != "-") && (state[10] != "-") && (state[11] != "-") &&
                (state[12] != "-") && (state[13] != "-") && (state[14] != "-") && (state[15] != "-")) 
            return "Draw";
        else
            return "Not Gameover";
    }   
}

Please excuse the bluntness of the isGameOver() method; it merely checks the state of the board (i.e. win, draw, or game not over).

The getAvailableMoves() method:

public int[] getAvailableMoves() {
    int count = 0;
    int i = 0;
    for(int j = 0; j < state.length; j++) {
        if(state[j] == "-")
            count++;
    }
    int [] availableSlots = new int[count];
    for(int j = 0; j < state.length; j++){
        if(state[j] == "-")
            availableSlots[i++] = j;        
    }
    return availableSlots;
}

This method merely returns an int array with all available next moves (for the current state), or an empty array if no moves are available or if the game is over.

The score() method:

public int score() {
    if(isGameOver() == "X Won")
        return -10;
    else if(isGameOver() == "O Won")
        return +10;
    else 
        return 0;
}

setO(), setX() and revert():

public void setX(int i) {
    state[i] = "X";
    turn = "O";
    lastAdded = i;
}
public void setO(int i) {
    state[i] = "O";
    turn = "X";
    lastAdded = i;
}
public void revert(int i) {
    state[i] = "-";
    if(turn == "X")
        turn = "O";
    else
        turn = "X";
}

My main method looks like this for a 3x3 game:

public static void main(String args[]) {
    State s = new State(3);
    int [] ScoreAndRecommendedMove = new int[2];
    s.setX(8);
    ScoreAndRecommendedMove = s.newminimax47(8);
    System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]);
}

In this game, X has started the game with a move at position 8. The method in this case will return

Score: 0 Position: 4

This means that O's most promising move is at position 4 and in the worst case will yield a score of 0 (i.e. a draw).

The following image is meant to give an idea of how newminimax47() works. In this case our starting state(board) is given the number 1. Note: The numbers indicate the precedence in creation of the regarded states. 1 was created before 2, 2 was created before 3, 3 was created before 4 and so on.

In this scenario, the score and position eventually returned to state 1 will be

Score: 0 Position: 6

coming from state 8.

Note: The code you see is just snippets of the actual State class. These snippets on their own should allow you to recreate, and play around with, the newminimax47 function without problems (at least for 3x3). Any bugs you may find are not really bugs; they stem from parts that were simply not included in the snippets I copied here, and the code should work without them. The lastAdded variable in the setO and setX functions, for example, is not included in the snippets here, but I just realized you don't need it to work with the minimax function, so you can just comment it out.

What is the purpose of the z parameter given to the minmax method? — xXliolauXx, Aug 11 '15 at 07:29
your setX() and setO() methods use a variable 'lastAdded' but it's undefined... there only is a local variable with the same name in newMinimax47() — Stef, Aug 11 '15 at 07:45
With a 3x3 board, there are 9 possibilities for the first move, 8 for the second, and so on for a total of `9! = 362880` total moves. With a 4x4 board, the total number of moves is `16! = 20922789888000`. So there's probably nothing wrong with the algorithm, you just need to run it on a really really really fast computer :) — user3386109, Aug 11 '15 at 08:15
Doesn't score depend on only your current state of board? In that case you can have 3^16 different states ('O', 'X' or empty). You can use memoization to speed up recursion. — xashru, Aug 11 '15 at 14:44

Stef · Accepted Answer · 2015-08-12T08:50:34.397

I played around with your code, and there is quite a bit to say.

bug

First of all, there is a bug. I don't think your code actually works for a 3x3 board. The problem is where you revert the move you added to the board. You do this exactly once, at the end of the newminimax47 method, even though the method adds moves to the board inside a for loop. This means that calling the method not only computes something but also changes the board state, and the rest of the code expects it not to.

So remove the revert from where it is now and instead revert as soon as you can:

setX(x);
currentScore = newminimax47(x)[0];
revert(x);

This also means you don't need the lastAdded variable.
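
To make the make-then-undo pattern concrete, here is a minimal, self-contained sketch. It does not use the State class from the question: the class name MakeUnmakeDemo, the char[] board and the LINES table are assumptions made for illustration only, and the board is fixed at 3x3. The point is just that every move is undone right after its recursive call, so the board is unchanged once the search returns.

```java
class MakeUnmakeDemo {
    // The eight winning lines of a 3x3 board, as index triples (illustrative helper).
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    static final char[] board = "---------".toCharArray();

    // Returns 'X' or 'O' if that player owns a complete line, otherwise '-'.
    static char winner() {
        for (int[] l : LINES) {
            char c = board[l[0]];
            if (c != '-' && c == board[l[1]] && c == board[l[2]]) return c;
        }
        return '-';
    }

    // Score with best play from the current position:
    // +10 if O wins, -10 if X wins, 0 for a draw (same scoring as in the question).
    static int minimax(char turn) {
        char w = winner();
        if (w != '-') return (w == 'O') ? 10 : -10;
        int best = 0;                 // stays 0 if no square is free (draw)
        boolean first = true;
        for (int i = 0; i < board.length; i++) {
            if (board[i] != '-') continue;
            board[i] = turn;                               // make the move
            int score = minimax(turn == 'X' ? 'O' : 'X');
            board[i] = '-';                                // undo it right away
            if (first || (turn == 'O' ? score > best : score < best)) {
                best = score;
                first = false;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        board[8] = 'X';               // X opened at position 8, as in the question
        int score = minimax('O');
        System.out.println("Score: " + score + " Board after search: " + new String(board));
    }
}
```

Because each recursive call restores the square it filled, the caller sees exactly the board it passed in, which is what the rest of the code relies on.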

play

It's a lot easier to see what is happening if you actually play against your own algorithm. Add a method to your State class

public void dump() {
    for (int y = 0; y < n; y++) {
        for (int x = 0; x < n; x++) {
            System.out.print(state[y * n + x]);
        }
        System.out.println();
    }
}

and in your main something like

// needs: import java.util.Arrays; import java.util.Scanner;
public void play() {
    State s = new State(3);
    Scanner in = new Scanner(System.in);
    while (s.isGameOver().equals("Not Gameover")) {
        int[] options = s.getAvailableMoves();
        s.dump();
        System.out.println("Your options are " + Arrays.toString(options));
        int move = in.nextInt();
        s.setX(move);
        // Stop if X just won or filled the board; otherwise there is no move left for O.
        if (!s.isGameOver().equals("Not Gameover")) break;
        int[] ScoreAndRecommendedMove = s.newminimax47(0);
        System.out.println("Score: " + ScoreAndRecommendedMove[0] + " Position: " + ScoreAndRecommendedMove[1]);
        s.setO(ScoreAndRecommendedMove[1]);
    }
    s.dump();
}

and you can actually play against it. On a 3x3 board this works just fine for me. Unfortunately, I estimate that computing the first move on a 4x4 board would take my computer approximately 48 hours.

data types

Your choice of data types is often a bit... strange. If you want to remember a single character, use char instead of String. If you want to return a yes/no decision, try using boolean. There are also some parts of the program that could be replaced by less code doing the same. But that wasn't your question, so on to...

algorithm

OK, so what's wrong with using minimax to solve this problem? Suppose the first four moves are X5, O8, X6, O7. Another possibility is to start the game with X5, O7, X6, O8. Yet another one is X6, O7, X5, O8. And finally there's X6, O8, X5, O7.

All four of those possibilities for the first four moves of the game lead to the exact same game state. But minimax will not recognise that they're the same (basically, there is no memory of parallel branches), so it will compute all four of them. And the number of times each board state is computed increases rapidly as you search deeper.

The number of possible games vastly outnumbers the number of possible board states. To estimate the number of games: at first there are 16 possible moves, then 15, then 14, 13, ... and so on. A rough estimation is 16!, although minimax won't have to compute all of them, because many of them will have finished before the 16th move.

An estimate for the number of game states: every square on the board can be either empty, an X, or an O. So that's 3^16 boards. Not all of them are actually valid boards, because the number of Xs on the board can be at most one more than the number of Os, but 3^16 is still a reasonable upper bound.

16! possible games is about half a million times more than 3^16 possible board states. This means we're computing every board approximately half a million times instead of just once.
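
The arithmetic behind that ratio can be checked with a few lines of Java. This is only a sanity check of the rough estimates above; the class and method names are made up for the example.

```java
class GameCountEstimate {
    // n! as a long; 16! fits comfortably (long only overflows from 21! on).
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }

    // 3^cells: each square is either empty, an X, or an O.
    static long boardCount(int cells) {
        long result = 1;
        for (int i = 0; i < cells; i++) result *= 3;
        return result;
    }

    public static void main(String[] args) {
        long games = factorial(16);     // upper estimate for move sequences
        long boards = boardCount(16);   // upper estimate for board states
        System.out.println("16!   = " + games);
        System.out.println("3^16  = " + boards);
        System.out.println("ratio = " + games / boards);
    }
}
```

With 16! = 20922789888000 and 3^16 = 43046721, the integer ratio comes out at 486048, i.e. roughly half a million.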

The solution is to start remembering every game state you compute. Every time the recursive function is called, first check whether you already know the answer, and if so, just return the old answer. This technique is called memoization.

Memoization

I'll describe how to add memoization while using the data structures you already chose (even though I don't agree with them). To do memoization you need a collection that supports quick adds and quick lookups. A list (for example ArrayList) won't do us any good: it's fast to add values, but a lookup is very slow in long lists. There are some options, but the easiest one to use is HashMap. In order to use HashMap you need to create something that represents your state and that you can use as a key. The most straightforward approach is to just make a String with all the X/O/- symbols in it that represent your board.

So add

Map<String,int[]> oldAnswers = new HashMap<String,int[]>();

to your State object.

Then at the start of your newminimax47 method create the String that represents the State and check if we already know the answer:

    String stateString = "";
    for (String field : state) stateString += field;
    int[] oldAnswer = oldAnswers.get(stateString);
    if (oldAnswer != null) return oldAnswer;

Finally, when you compute a new answer at the end of newminimax47, you should not only return it but also store it in the map:

    int[] answer = {bestScore, bestPos};
    oldAnswers.put(stateString, answer);
    return answer;

With memoization in place, I was able to play a 4x4 game against your code. The first move is still slow (20 seconds), but after that everything has been computed and it's very fast. If you want to speed it up further, you could look into alpha-beta pruning, but the improvement won't be nearly as large as with memoization. Another option is using more efficient data types. It won't reduce the theoretical order of your algorithm, but could still easily make it 5 times faster.
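
Putting the pieces together, a stand-alone sketch of memoized minimax might look like the following. It is not the State class from the question: the class name MemoMinimax, the char[] board, the generated lines table and the choice to append the player to move to the map key are all assumptions made for this example, and it only returns the score, not the best position.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MemoMinimax {
    final int n;
    final char[] board;
    final int[][] lines;                                  // rows, columns, both diagonals
    final Map<String, Integer> memo = new HashMap<>();    // position -> score

    MemoMinimax(int n) {
        this.n = n;
        board = new char[n * n];
        Arrays.fill(board, '-');
        lines = new int[2 * n + 2][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                lines[r][c] = r * n + c;                  // row r
                lines[n + r][c] = c * n + r;              // column r
            }
            lines[2 * n][r] = r * n + r;                  // main diagonal
            lines[2 * n + 1][r] = r * n + (n - 1 - r);    // anti-diagonal
        }
    }

    // Returns 'X' or 'O' if that player owns a complete line, otherwise '-'.
    char winner() {
        for (int[] line : lines) {
            char first = board[line[0]];
            if (first == '-') continue;
            boolean all = true;
            for (int idx : line) {
                if (board[idx] != first) { all = false; break; }
            }
            if (all) return first;
        }
        return '-';
    }

    // +10 if O wins, -10 if X wins, 0 for a draw, assuming best play by both sides.
    int minimax(char turn) {
        String key = new String(board) + turn;            // board plus player to move
        Integer known = memo.get(key);
        if (known != null) return known;                  // already solved this position

        int best = 0;                                     // stays 0 on a full board (draw)
        char w = winner();
        if (w != '-') {
            best = (w == 'O') ? 10 : -10;
        } else {
            boolean first = true;
            char next = (turn == 'X') ? 'O' : 'X';
            for (int i = 0; i < board.length; i++) {
                if (board[i] != '-') continue;
                board[i] = turn;                          // make the move
                int score = minimax(next);
                board[i] = '-';                           // undo it
                if (first || (turn == 'O' ? score > best : score < best)) {
                    best = score;
                    first = false;
                }
            }
        }
        memo.put(key, best);                              // remember the answer
        return best;
    }

    public static void main(String[] args) {
        System.out.println("3x3 value: " + new MemoMinimax(3).minimax('X'));
    }
}
```

The same class accepts n = 4, but the map is then bounded only by the number of reachable positions (at most 3^16 per player to move), so heap size may need attention and timings will depend on the machine.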

Thank you for the enlightening answer. How would you implement it in my program though? What I have in mind right now is to create a field in my State class which would be a list of String arrays where all my created states would be stored. Then every time before newminimax47 is called within itself, I check if there is a duplicate in the list. Would you recommend this? I have a feeling the list would get really big. PS: Your comment on using chars instead of Strings has been duly noted. — Omar Sharaki, Aug 11 '15 at 22:41
Nice answer, for more information on transposition tables and speedgain see my comment on the other answer... — xXliolauXx, Aug 12 '15 at 06:41
@Stef A few notes regarding your answer: It seems memoization is incompatible with alpha beta pruning(i.e. they can't be implemented in the same method at the same time). Also, a minimax method with alpha beta pruning in place seems to run much faster than one with memoization. For more detail see my answer on this [post](http://stackoverflow.com/questions/32154533/how-to-call-a-minimax-methodwith-alpha-beta-pruning-properly) — Omar Sharaki, Aug 23 '15 at 11:32

score 0 · Answer 2 · answered Aug 11 '15 at 10:23

As user3386109 explained, the problem here is how many times you calculate everything. There are a few things that may help you out though, considering an N-sized grid:

a user cannot win if there are fewer than N of his symbols, so you can use that in the isGameOver function as a first check
the first thing that you have to do is to prevent your opponent from winning if there's a chance that his next move is a winning one
Keep track of how many "X" and "O" are in each row and column and in the two diagonals by incrementing counters every move. If there are N-1 of the same symbol, the next one will be a winning move for either you or your opponent.
By doing so, you can easily tell which is the best move, because then:
- if you have a winning move you put the symbol there
- check your opponent, if he has N-1 symbols on the same row/column/diagonal you put it there
- if your opponent has more symbols than you on some place, you even out the place (that means +1 or +2, depending on who's starting the game)
- if that's not the case, you put your next symbol on the row/column/diagonal where you have more symbols
- if you have the same number of symbols on some places, you just put it where your opponent has more symbols
- if you and your opponent are entirely even, just go for your own strategy (random would not be bad, I guess :-) )
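
The counter bookkeeping described above could be sketched roughly as follows. The class name LineCounters and its methods are hypothetical, and the sketch assumes the caller only passes legal moves (in range, on an empty square). It covers just the counting and the "N-1 on a line" check, not the full move-selection rules.

```java
class LineCounters {
    final int n;
    // One counter per line and player: rows 0..n-1, columns n..2n-1,
    // main diagonal at index 2n, anti-diagonal at index 2n+1.
    final int[] xCount;
    final int[] oCount;

    LineCounters(int n) {
        this.n = n;
        xCount = new int[2 * n + 2];
        oCount = new int[2 * n + 2];
    }

    // Records a move and returns true if it completes a line for that player.
    boolean place(char player, int row, int col) {
        int[] mine = (player == 'X') ? xCount : oCount;
        boolean won = false;
        if (++mine[row] == n) won = true;                        // its row
        if (++mine[n + col] == n) won = true;                    // its column
        if (row == col && ++mine[2 * n] == n) won = true;        // main diagonal
        if (row + col == n - 1 && ++mine[2 * n + 1] == n) won = true; // anti-diagonal
        return won;
    }

    // True if the player has n-1 marks on some line the opponent has not touched,
    // i.e. one more move on that line would win.
    boolean hasThreat(char player) {
        int[] mine = (player == 'X') ? xCount : oCount;
        int[] theirs = (player == 'X') ? oCount : xCount;
        for (int i = 0; i < mine.length; i++) {
            if (mine[i] == n - 1 && theirs[i] == 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LineCounters lc = new LineCounters(3);
        lc.place('X', 0, 0);
        lc.place('X', 0, 1);
        System.out.println("X threatens to win: " + lc.hasThreat('X'));
    }
}
```

Each move updates at most four counters, so detecting a win or an immediate threat no longer requires rescanning the whole board.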

Unless you really need it (for example as homework), I wouldn't use recursion for this one.

Just as a side note: I don't think it's good practice to have what is actually a boolean function return a string and then compare that with a fixed value. A true/false return value for the isGameOver function looks much better to me.
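
As an illustration of that side note, a loop-based check that works for any board size and returns a char or boolean instead of a String could look like this. The names BoardCheck, winner, sameOwner and isGameOver are invented for the example, and the board is assumed to be a char[] of length n*n using '-' for empty squares. It also includes the "fewer than N symbols" shortcut from the first bullet.

```java
class BoardCheck {
    // Returns 'X' or 'O' if that player owns a complete line, otherwise '-'.
    static char winner(char[] board, int n) {
        int xs = 0, os = 0;
        for (char c : board) {
            if (c == 'X') xs++;
            else if (c == 'O') os++;
        }
        if (xs < n && os < n) return '-';     // nobody has enough marks to win yet

        for (int i = 0; i < n; i++) {
            if (sameOwner(board, i * n, 1, n)) return board[i * n];   // row i
            if (sameOwner(board, i, n, n)) return board[i];           // column i
        }
        if (sameOwner(board, 0, n + 1, n)) return board[0];           // main diagonal
        if (sameOwner(board, n - 1, n - 1, n)) return board[n - 1];   // anti-diagonal
        return '-';
    }

    // True if the n squares start, start+step, start+2*step, ... all hold
    // the same non-empty mark.
    static boolean sameOwner(char[] board, int start, int step, int n) {
        char first = board[start];
        if (first == '-') return false;
        for (int k = 1; k < n; k++) {
            if (board[start + k * step] != first) return false;
        }
        return true;
    }

    // True if someone has won or no empty square is left.
    static boolean isGameOver(char[] board, int n) {
        if (winner(board, n) != '-') return true;
        for (char c : board) {
            if (c == '-') return false;
        }
        return true;
    }

    public static void main(String[] args) {
        char[] board = "---O--O--O--O---".toCharArray();   // O on the 4x4 anti-diagonal
        System.out.println("Winner: " + winner(board, 4));
    }
}
```

Separating "who won" (a char) from "is the game over" (a boolean) avoids comparing strings against fixed values at every call site.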

answered Aug 11 '15 at 10:23

ChatterOne

The beginning of your answer looks good, but I don't see any proof that your "best move" rules really are optimal (and I'm fairly sure they aren't, at least not for general *N*-in-a-row games). In particular, once both you and your opponent have symbols in a given row (or column or diagonal), there's no reason to "even it out"; a mixed row can never win, and the *only* reason to ever play on one is because it coincidentally intersects another row that still may win. – Ilmari Karonen Aug 11 '15 at 10:54
You're right and I'm well aware of it not being optimal, but this was meant just as a suggestion. Feel free to edit it to add your own suggestions :-) – ChatterOne Aug 11 '15 at 11:02
It actually IS possible to search through 16! possibilities, since a proper alpha-beta-algorithm will reduce it to about 5*(16!^0.5). This number of variations can be searched in no time. In addition to that, you can reduce the number even further by mirroring the board to 5*(16!^0.5)/4 = 5*(15!^0.5). – xXliolauXx Aug 11 '15 at 11:17
@xXliolauXx What does my implementation of alpha-beta pruning lack? Is there anything you can suggest to make it better? My initial thought was that it HAS been done using alpha-beta pruning for 4x4 boards, so the problem can't be that it isn't doable. – Omar Sharaki Aug 11 '15 at 17:44
@Omar If you really want to, you could reduce the number to about (16!/8!/8!), resulting in a bit less than 13'000 nodes. If you want to get that result, you will have to implement transposition tables (extremely efficient at tictactoe) and consider that every position can be mirrored/turned. If you want to simply improve the alpha beta, you will need move ordering. Therefore, simply write heuristics about which moves "look good", and search the moves first that do so (easiest to achieve, although not that easy to find good heuristics for tictactoe) – xXliolauXx Aug 12 '15 at 06:35
Could you elaborate on how rotation / transposition reduces the number of possibilities so much? I can see that the first move is irrelevant, and for the next move there are probably 6 possibilities, but after that it doesn't help all that much anymore? Sure, 1 * 6 * 14! is a lot less than 16!, but I'm not getting anywhere near the number you're claiming. – Stef Aug 12 '15 at 07:36
@Stef TT does not reduce the pure mathematical number of possibilities, but it does reduce the number of nodes in the search tree that actually have to be searched. For further info, I suggest asking Google (chessprogramming.net has a really nice explanation of TTs, it's just chess-related there). The maximum efficiency here is the number of possible game states in tic-tac-toe (which is significantly smaller than the number of games that could be played). That's the reason for the huge difference. – xXliolauXx Aug 12 '15 at 08:39
@xXliolauXx Thanks for the tips, I really appreciate it. – Omar Sharaki Aug 12 '15 at 20:21
@Omar np, that's what we're here for, I guess. – xXliolauXx Aug 13 '15 at 06:18
Maybe you guys might find my new question interesting @xXliolauXx – Omar Sharaki Aug 19 '15 at 02:41
@Stef Maybe you guys might find my new question interesting – Omar Sharaki Aug 19 '15 at 02:42
