My question is about TTalphabeta (alpha-beta with a transposition table). In the MTD(f) paper (https://people.csail.mit.edu/plaat/mtdf.html) the implementation of alpha-beta is quite different from another one I found (https://homepages.cwi.nl/~paulk/theses/Carolus.pdf).

What is the difference between them?
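
As far as I can tell the biggest difference is what a table entry stores: Plaat's version keeps a lower and an upper bound for every node, while the thesis keeps a single score plus a flag saying which kind of bound it is. Roughly (the names here are mine, not from either source):

#include <climits>

// Plaat's AlphaBetaWithMemory: each entry holds an interval, so repeated
// null-window searches through the same node keep tightening it.
struct plaat_entry {
    int lower_bound = INT_MIN;  // best proven "score >= lower_bound"
    int upper_bound = INT_MAX;  // best proven "score <= upper_bound"
    int depth       = 0;
};

// Classic alpha-beta (the thesis): one score plus a flag that says whether it
// is exact, a lower bound (fail high) or an upper bound (fail low).
enum class score_type { exact, lower, upper };
struct classic_entry {
    int        score = 0;
    score_type type  = score_type::exact;
    int        depth = 0;
};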

My heuristic returns an int, so there are no decimals; should I take any special care?
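
For context, the driver I am following is the MTD(f) loop from Plaat's page; with integer scores the null window is just bumped by 1. A sketch against my types (treat the details as approximate):

// Sketch of the MTD(f) driver from Plaat's page, written against the types
// used below (bound, depth, transposition_table); details are approximate.
template <class node, class transposition_table>
bound mtdf(node& root, bound first_guess, depth d, transposition_table& table)
{
    bound g     = first_guess;
    bound lower = INT_MIN;   // proven lower bound on the root score
    bound upper = INT_MAX;   // proven upper bound on the root score

    while (lower < upper) {
        // Null window just above the current guess.  Scores are plain ints,
        // so a step of 1 is the smallest possible step; no epsilon needed.
        bound beta = (g == lower) ? g + 1 : g;

        g = alpha_beta_with_memory(root, d, beta - 1, beta, table)._bound;

        if (g < beta)
            upper = g;       // failed low: g is an upper bound
        else
            lower = g;       // failed high: g is a lower bound
    }
    return g;
}

Since the step is exactly 1 I don't see where decimals would matter, but maybe I am missing something.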

My implementation right now is:

template <class node, class transposition_table>
bound_and_action<node> alpha_beta_with_memory(node& root, depth depth,
        bound alpha, bound beta, transposition_table& table)
{
    auto value_in_hash = table.find(root);

    if (value_in_hash != table.end()
            && value_in_hash->second._depth > depth) { // Transposition table lookup

        auto bound_in_hash = value_in_hash->second;

        if (bound_in_hash.lower_bound >= beta)
            return { bound_in_hash.lower_bound, root.get_action() };

        if (bound_in_hash.upper_bound <= alpha)
            return { bound_in_hash.upper_bound, root.get_action() };

        alpha = std::max(alpha, bound_in_hash.lower_bound);
        beta = std::min(beta, bound_in_hash.upper_bound);
    }

    bound_and_action<node> ret;

    if (depth == 0 || root.is_terminal()) { // Leaf node

        ret._bound = root.get_heuristic_value();
        ret._action = root.get_action();

    } else {

        list<node> children;
        root.get_children(children);

        if (root.is_max_node()) {

            ret._bound = INT_MIN;
            bound a = alpha; //Save original alpha

            for (auto child : children) {

                bound_and_action<node> possible_ret =
                    alpha_beta_with_memory(child, depth - 1, a, beta, table);

                if (possible_ret._bound == 1000) {
                    return {1000, root.get_action()};
                }

                if (possible_ret._bound > ret._bound ) {
                    ret._bound = possible_ret._bound;
                    ret._action = child.get_action();
                }

                a = std::max(a, ret._bound);

                if ( beta <= ret._bound ) {
                    break;    // beta cut-off
                }

            }

        } else { // if root is a min node.

            ret._bound = INT_MAX;
            bound b = beta; //Save original beta


            for (auto child : children) {

                bound_and_action<node> possible_ret =
                    alpha_beta_with_memory(child, depth - 1, alpha, b, table);

                if (possible_ret._bound == 1000) {
                    return {1000, root.get_action()};
                }

                if (possible_ret._bound < ret._bound) {
                    ret._bound = possible_ret._bound;
                    ret._action = child.get_action();
                }

                b = std::min(b, ret._bound);

                if ( ret._bound <= alpha ) {
                    break;    // alpha cut-off
                }

            }
        }
    }

    // ----- Store bounds in the transposition table.

    hash_struct& hash_value = table[root];

    if (hash_value._depth < depth) {
        // Fail low result implies an upper bound.
        if (ret._bound <= alpha) {
            hash_value.upper_bound = ret._bound;
        }
        // Found an accurate minimax value - will not occur if called with zero window.
        if ( ret._bound > alpha && ret._bound < beta){
          hash_value.lower_bound = ret._bound;
          hash_value.upper_bound = ret._bound;
        }

        // Fail high result implies a lower bound.
        if (ret._bound >= beta ) {
            hash_value.lower_bound = ret._bound;
        }

        hash_value._depth = depth;
        hash_value._action = ret._action;
    }

    return ret;
}
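
The table maps positions to entries holding two bounds, a depth and a move, roughly like this (a sketch, not my exact declaration; the defaults and the move type here are placeholders):

// Sketch of the entry type used above (hash_struct).  Not the exact
// declaration: the defaults and the action type are placeholders.
struct hash_struct {
    bound lower_bound = INT_MIN;  // tightest proven "true score >= lower_bound"
    bound upper_bound = INT_MAX;  // tightest proven "true score <= upper_bound"
    depth _depth      = 0;        // depth of the search that produced the bounds
    int   _action     = 0;        // stands in for the real move/action type
};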

But it doesn't work as well as it should. Everything is fine if no TT is used, so something in the use of it must be wrong.

Ludvins
  • The mtdf version tries to zoom in on the correct score by lots of re-searches with a null window. For it to work well, it has to store **both** an upper bound and a lower bound for each node in the table. Traditional alpha-beta just stores *either* an upper or a lower bound. – Bo Persson May 24 '18 at 14:46
  • Using a zero window the value is never between alpha and beta, so I don't need both bounds; that's what the original paper says: /* Found an accurate minimax value - will not occur if called with zero window */ if g > alpha and g < beta then n.lowerbound := g; n.upperbound := g; store n.lowerbound, n.upperbound; – Ludvins May 24 '18 at 15:00
  • But you do. The zero windows will move back and forth all the time, so if you don't have two bounds in the TT, a lot of the lookups will not be useful. – Bo Persson May 24 '18 at 15:05
  • I see, thank you for answering. Then the implementation in the second link is wrong? – Ludvins May 24 '18 at 15:07
  • I don't know really. If you are interested, you could check out the [publications of Don Dailey](https://chessprogramming.wikispaces.com/Don+Dailey#Selected%20Publications). He has done some research on adapted versions of mtd-f. – Bo Persson May 24 '18 at 15:16
  • I've already changed the algorithm to have 2 bounds; if my heuristic is discrete, do I have to change any comparison such as the one I mentioned before? – Ludvins May 24 '18 at 16:38
  • I experimented with MTD(f) when Aske's thesis was new, about 20(!) years ago. I really don't remember details at that level. :-) – Bo Persson May 24 '18 at 18:40

0 Answers