Minimax for Connect Four (Unity3D and C#): Problems

Question

I'm developing a Connect Four game with AI in Unity3D (C#). I use for this the MiniMax algorithm according to this (German) Pseudocode.

The AI is still playing pretty bad. It tries to get even 4 in a row, although there are only three free fields. The AI always goes through the row and blocks only when it is needed. Unfortunately, it also blocks not always.

Where to hide the problems? What have I forgotten?

How can I integrate random moves of the AI when no one wins or loses at the next move.

Here is my source code:

minimaxDepth = 4

function call: Max (minimaxDepth, fieldCopy);

int Max (int depth, int[,] fieldCopy)
{
    int maxValue = -9999;
    int moveValue;
    bool winAI = false;
    bool winHuman = false;
    bool isStoneBelow = false;

    for (int y = 0; y < 6; y++) {
        for (int x = 0; x < 7; x++) {
            if (y > 0) {
                //is a stone under it?
                if (fieldCopy [x, y - 1] == 1 || fieldCopy [x, y - 1] == 2) {
                    isStoneBelow = true;
                } else {
                    isStoneBelow = false;
                }
            } else {
                isStoneBelow = true;
            }

            // possible move?
            if (fieldCopy [x, y] != 1 && fieldCopy [x, y] != 2 && isStoneBelow == true) {   
                isStoneBelow = false;
                fieldCopy [x, y] = 2; //simulate move
                winAI = false;
                winHuman = false;

                //Is there a winner?
                if (CheckWin (x, y, 2, fieldCopy)) {
                    winAI = true;
                    winHuman = false;
                }

                //No more moves possible?
                if (depth <= 1 || winAI == true) { 
                    moveValue = evaluationFunction (winAI, winHuman);       //evaluate the move
                } else {
                    moveValue = Min (depth - 1, fieldCopy);
                }

                fieldCopy [x, y] = 0; //Reset simulated move

                if (moveValue > maxValue) {
                    maxValue = moveValue;
                    if (depth == minimaxDepth) {
                        aiMoveX = x; // next move
                    }
                }
            }
        }
    }
    return maxValue;
}

int Min (int depth, int[,] fieldCopy)
{
    int minValue = 9999;
    int moveValue;
    bool winAI = false;
    bool winHuman = false;
    bool isStoneBelow = false;
    bool loopBreak = false;

    for (int y = 0; y < 6; y++) {
        for (int x = 0; x < 7; x++) {
            if (y > 0) {
                //is a stone under it?
                if (fieldCopy [x, y - 1] == 1 || fieldCopy [x, y - 1] == 2) {
                    isStoneBelow = true;
                } else {
                    isStoneBelow = false;
                }
            } else {
                isStoneBelow = true;
            }

            // possible move?
            if (fieldCopy [x, y] != 1 && fieldCopy [x, y] != 2 && isStoneBelow == true) {   
                isStoneBelow = false;
                fieldCopy [x, y] = 1; //simulate move
                winHuman = false;
                winAI = false;

                //Is there a winner?    
                if (CheckWin (x, y, 1, fieldCopy)) {
                    winHuman = true;
                    winAI = false;
                }

                //No more moves possible?
                if (depth <= 1 || winHuman == true) {  
                    moveValue = evaluationFunction (winAI, winHuman);       //evaluate the move
                } else {
                    moveValue = Max (depth - 1, fieldCopy);
                }

                fieldCopy [x, y] = 0; //Reset simulated move

                if (moveValue < minValue) {
                    minValue = moveValue;
                }
            }
        }
    }
    return minValue;
}


int evaluationFunction (bool winAI, bool winHuman)
{
    if (winAI) { 
        return 1;
    } else if (winHuman) {
        return -1;
    } else {
        return 0;
    }
}

Thanks for your help!

TL;DR. You're asking us to debug something but give it a try yourself first. Construct the path to the first move on paper and see where the actual code deviates. — H H, Aug 05 '13 at 13:32
The first question is a general question not a question to my source code and I debug my source code repeatedly. Due to the recursion, it is not so easy. — Oli, Aug 05 '13 at 13:42
No, debugging recursive code certainly isn't easy...but necessary in this case. Given the code, as it is written, you should be able to figure out EXACTLY what the AI's next move should be given any initial state you care to setup. Knowing that, and seeing what move it actually makes, will tell you a Lot. Your job is to figure out, not just why it isn't doing what it should, but why it IS doing whatever it's doing. — Nevyn, Aug 05 '13 at 15:13
Regarding (1): Imagine your AI is constrained by time, meaning it can't test every move. How would you evaluate a board where 2 AI's had both placed 2 tokens? A good heuristic will assign some nominal value to things like contiguous chains of tokens. That's why you use large values for wins, because those are game enders and should always trump any move. Regarding (2), agree with Nevyn, Oli, and Henk. — Jerdak, Aug 05 '13 at 18:34
Thank you, Jerdak. It may be that you would have seen a bug in the code at first attempt. I have now added the pseudocode, which I have used for my source code. — Oli, Aug 06 '13 at 07:20

score 4 · Answer 1 · answered Aug 20 '13 at 15:25

Some time ago, I also used Connect 4 as an example for a minimaxing algorithm and experimented with the trade-off between evaluation function and search depth. I found to my surprise that you could achieve pretty reasonable play (for this game) with almost no search but a strong heuristic for the evaluation function. This seemed to perform much better than a weak heuristic for the evaluation function and a deep search.

Getting 4 in a row is the winning condition for the game, so to achieve that you first have to achieve either 3 in a row (vertically, horizontally or diagonally) or a pattern like O-OO or OO-O (horizontally or diagonally). The more of these 'potential 4s in a row' the higher the evaluation score should be. Smaller scores can be contributed to the evaluation function by getting 2 in a row. Also it is an advantage to place your counter towards the center of the board as there is greater potential to form a line of 4, so an evaluation function should also reward counters that are close to the center. Other refinements are also possible based on the state of play.

For example, an important consideration is that you can force a win if you have two of these potential winners next to each other in the same column of the board. You can force your opponent to block the first one and then win by playing your stone above it in that same column.

If your evaluation function encodes these kinds of ideas then you should get some reasonable play when combined with mini-maxing.

How about that: For every potential 4-in-a-row on the board (not obstructed by enemy pieces), assign some score according to how many own pieces are still missing. Sum all those values. I think that captures (and maybe generalizes) what you are explaining. It includes bonus for center positions implicitly. — ziggystar, Aug 21 '13 at 11:59
Yes, I like the way you're thinking. Probably has the same results as what I suggested, but your description captures it more elegantly. Might still be worth including the forced-win-down-a-column heuristic so that adjacent potential 4s-in-a-row are given a higher evaluation than separate ones, although even without it as part of the evaluation function, I think the mini-maxing should be capable of discovering it and acting on it as a winning sequence of moves, should it occur. — Simon White, Aug 21 '13 at 16:08

d3dave · Accepted Answer · 2013-08-16T22:19:02.233

I believe the problem lies in your evaluation function. Instead of evaluating whether one of the players has won, you should evaluate their state in the game. A possible consideration would be the length of the longest contiguous chain of that players pieces. This accurately shows how far each player is from winning, instead of returning the same value even when a player is about to win, but hasn't yet.

Here's pseudocode for clarity:

int evaluationFunction (int depth)
{
    // max_ai_chain and max_human_chain are global variables that 
    // should be updated at each piece placement

    if (depth % 2 == 0) // assuming AI gets calculated on even depth (you mentioned your depth is 4)
    { 
        return max_ai_chain;
    } else {
        return max_human_chain;
    }
}

Minimax for Connect Four (Unity3D and C#): Problems

2 Answers2