MiniMax with Alpha Beta Pruning for Othello not working

Question

I have the following implementation of alpha-beta minimax for an Othello (Reversi) game. It never returns the proper action to take. Instead, it seems to return the default action I pass into the function, (0, 0), together with a value of -32768, which suggests the search got pruned in the MAX subroutine. Any tips on how I can improve this and fix the problem?

Note: I've identified the successors being returned properly for the most part. The max depth for now is 8. Computer player's pn (player number) is 1 and the human player's is 0. The first stage, 0, is MINIMAX_MAX. Alpha and beta are initially set to INT_MIN and INT_MAX respectively.

mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage) {
    if (G.check_terminal_state() || depth == MAX_DEPTH) {
#ifdef DEBUG
        cout << "best action: (" << A.get_x() << ", " << A.get_y() << ")\n";
#endif
        return mm_out(A, G.get_utility(pn));
    }

    // add end game score total here

#ifdef DEBUG
    if (stage == MINIMAX_MAX) {
        cout << "max " << alpha << " " << beta << "\n";
    }
    else {
        cout << "min " << alpha << " " << beta << "\n";
    }
#endif

    set<Action> succ_temp = G.get_successors(pn);
    for (Action a : succ_temp) {

#ifdef DEBUG
        cout << a.get_x() << " " << a.get_y() << '\n';
#endif

        Grid gt(G);
        a.evaluate(gt);
    }
    set<Action, action_greater> successors(succ_temp.begin(), succ_temp.end());

#ifdef DEBUG
    Player p(0, "minimaxtest");
    G.display(p);
    int test;
    cin >> test;
#endif

    // if no successor, that player passes
    if (successors.size()) {
        for (auto a = successors.begin(); a != successors.end(); ++a) {
            Grid gt(G);
            gt.do_move(pn, a->get_x(), a->get_y(), !PRINT_ERR);
            Action at = *a;
            mm_out mt = minimax(gt, alpha, beta, at, pn ^ 1, depth + 1, !stage);
            int temp = mt.val;
//          A = mt.best_move;

            if (stage == MINIMAX_MAX) {
                if (alpha < temp) {
                    alpha = temp;
                    A = *a;
#ifdef DEBUG
                    cout << "Current action: (" << A.get_x() << ", " << A.get_y() << ") alpha = " << alpha << "\n";
#endif
                }
                if (alpha >= beta) {
#ifdef DEBUG
                    cout << "pruned at max\n";
#endif
                    return mm_out(A, beta);
                }
            }
            else {
                if (beta > temp) {
                    beta = temp;
                    A = *a;
#ifdef DEBUG
                    cout << "Current action: (" << A.get_x() << ", " << A.get_y() << ") beta = " << beta << "\n";
#endif
                }
                if (alpha >= beta) {
#ifdef DEBUG
                    cout << "pruned at min\n";
#endif
                    return mm_out(A, alpha);
                }
            }
        }
        return mm_out(A, (stage == MINIMAX_MAX) ? alpha : beta);
    }
    else {
        cout << "no successor\n";
        return mm_out(A, (stage == MINIMAX_MAX) ? (std::numeric_limits<int>::max() - 1) : (std::numeric_limits<int>::min() + 1));
    }
}

Utility function:

int Grid::get_utility(uint pnum) const {
    if (pnum)
        return wcount - bcount;
    return bcount - wcount;
}

score 2 · Accepted Answer · answered Dec 12 '15 at 23:16

You should pass the alpha / beta parameters by value (not by reference):

mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage)

Each node passes the alpha and beta values to its children. The children then update their own copies of the alpha or beta value depending on whose turn it is and return the final evaluation of that node. That is then used to update the alpha or beta value of the parent.
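
To make the difference concrete, here is a minimal sketch on a toy game tree. It assumes a hypothetical `Node` type and helper names (`leaf`, `inner`, `demo_tree`, `run_by_value`, `run_shared`) that are not part of the question's `Grid`/`Action` code; only the way the bounds are passed is meant to carry over. The second function shares the bounds by reference purely for contrast.

```cpp
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>

// Hypothetical toy tree (not the asker's Grid/Action types):
// a node without children is a leaf and carries a static value.
struct Node {
    int value;
    std::vector<Node> children;
};

Node leaf(int v) { return Node{v, {}}; }
Node inner(std::vector<Node> kids) { return Node{0, std::move(kids)}; }

// By value: each call works on its own copies of alpha and beta,
// so a child's updates never leak back into the parent or siblings.
int alphabeta(const Node& n, int alpha, int beta, bool maximizing) {
    if (n.children.empty()) return n.value;
    for (const Node& child : n.children) {
        int v = alphabeta(child, alpha, beta, !maximizing);
        if (maximizing) alpha = std::max(alpha, v);
        else            beta  = std::min(beta, v);
        if (alpha >= beta) break;  // cutoff: this branch cannot matter
    }
    return maximizing ? alpha : beta;
}

// By reference (for contrast only): every node in the tree mutates
// the same two variables, so a child's beta becomes the parent's beta.
int alphabeta_shared(const Node& n, int& alpha, int& beta, bool maximizing) {
    if (n.children.empty()) return n.value;
    for (const Node& child : n.children) {
        int v = alphabeta_shared(child, alpha, beta, !maximizing);
        if (maximizing) alpha = std::max(alpha, v);
        else            beta  = std::min(beta, v);
        if (alpha >= beta) break;
    }
    return maximizing ? alpha : beta;
}

// MAX root with two MIN children; the true minimax value is
// max(min(3, 12, 8), min(10, 11, 12)) = 10.
Node demo_tree() {
    return inner({ inner({leaf(3), leaf(12), leaf(8)}),
                   inner({leaf(10), leaf(11), leaf(12)}) });
}

int run_by_value() {
    return alphabeta(demo_tree(), INT_MIN, INT_MAX, true);
}

int run_shared() {
    Node root = demo_tree();
    int alpha = INT_MIN, beta = INT_MAX;
    return alphabeta_shared(root, alpha, beta, true);
}
```

On this tree `run_by_value()` returns 10, the true minimax value. `run_shared()` returns 3: the first MIN child lowers the shared beta to 3, the root then raises alpha to 3, and the root cuts off immediately without ever looking at the second child. That is the same kind of premature "pruned at max" behaviour described in the question.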

manlio

Thanks for the quick response! That fixes the bug I had. However, I've run into a new problem now. It still seems to pick moves that it can't play. I've used the debug output to see what's going on, and I saw that during the simulations it skips some obviously beneficial moves where the action's evaluation is higher than any other. (Ex: Playing (1,3) would give 3 more white pieces, as opposed to any other move giving 2.) I think this might be a problem. What can I do about this? I've updated the code to reflect my changes. – SenselessCoder Dec 12 '15 at 23:48
I actually managed to fix that problem just now as well. Now I can play with the computer, and even though not very smart at the moment, it's capable of playing. It's a bit slow though, even with MAX_DEPTH as 8. I've read a bit about ways of speeding it up but I'm not sure. What would you suggest in terms of performance boost for this code? – SenselessCoder Dec 13 '15 at 00:08
