Questions tagged [monte-carlo-tree-search]

Monte-Carlo Tree Search is a best-first, rollout-based tree search algorithm. It gradually improves its evaluations of nodes in the tree using (semi-)random rollouts through those nodes, focusing a larger proportion of rollouts on the parts of the tree that are the most promising. This tag should be used for questions about implementations of this algorithm.

Monte-Carlo Tree Search (MCTS) is a best-first tree search algorithm. "Best-first" means that it focuses the majority of the search effort on the most promising parts of a search tree. This is a popular algorithm for automated game-playing in Artificial Intelligence (AI), but also has applications outside of AI.

For an introduction to the types of search trees that MCTS is typically used for, see minimax.

Older algorithms, such as minimax and alpha-beta pruning, evaluate a node by systematically searching all the nodes below it, which can be computationally very expensive. The basic idea of MCTS is to rapidly approximate such systematic evaluations by running (semi-)random rollouts from a node and averaging the evaluations obtained at the end of those rollouts. Over time, the algorithm focuses a larger amount of search effort on the parts of the search tree that appear promising based on earlier rollouts, which gradually improves the approximated evaluations in those parts of the tree. Additionally, the algorithm gradually grows a tree by storing a small number (typically one) of new nodes in memory for every rollout. The part of the tree that is stored in memory is not traversed with the (semi-)random rollout strategy, but with a different ("Selection") strategy. This guarantees that, given an infinite amount of computation time, the algorithm converges to optimal solutions.
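
The ("Selection") strategy inside the stored part of the tree is most commonly the UCB1 rule applied to tree nodes, known as UCT (see the Kocsis and Szepesvári reference below). A commonly quoted form of the rule is sketched here; the exploration constant C is a tuning parameter chosen by the implementer (C = √2 is a frequent default, not something fixed by the algorithm itself):

```latex
% UCT selection: at a node visited n times in total, pick the child j maximizing
%   \bar{X}_j : average reward of the rollouts that passed through child j
%   n_j       : number of times child j has been visited
%   C         : exploration constant balancing exploitation and exploration
\[
  \mathrm{UCT}(j) \;=\; \bar{X}_j \;+\; C \sqrt{\frac{\ln n}{n_j}}
\]
```

The logarithm grows with the parent's visit count, so rarely visited children (small n_j) eventually get selected again, which keeps the search from committing prematurely to a single branch.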

An extensive survey, describing many enhancements and applications of the algorithm, can be found in:

  • Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P. I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., and Colton, S. (2012). A Survey of Monte Carlo Tree Search Methods. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 4, No. 1, pp. 1-43.

The algorithm was originally proposed in:

  • Kocsis, L. and Szepesvári, C. (2006). Bandit Based Monte-Carlo Planning. Proceedings of the 17th European Conference on Machine Learning (ECML 2006) (eds. J. Fürnkranz, T. Scheffer, and M. Spiliopoulou), Vol. 4212 of Lecture Notes in Computer Science, pp. 282-293, Springer Berlin Heidelberg.
  • Coulom, R. (2007). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. Computers and Games (eds. H. J. van den Herik, P. Ciancarini, and H. H. L. M. Donkers), Vol. 4630 of Lecture Notes in Computer Science, pp. 72-83, Springer Berlin Heidelberg.
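
Putting the four phases (selection, expansion, simulation, backpropagation) together, the following is a minimal, self-contained sketch of the algorithm in Python. It is illustrative only: the NimState toy game, the helper names (legal_moves, apply, winner), and the iteration budget are invented for this example and are not taken from the questions or references above.

```python
import math
import random


class NimState:
    """Toy game: players alternately remove 1-3 stones; whoever takes the last stone wins."""

    def __init__(self, stones=15, player=1):
        self.stones = stones
        self.player = player  # player to move: +1 or -1

    def legal_moves(self):
        return list(range(1, min(3, self.stones) + 1))

    def apply(self, move):
        return NimState(self.stones - move, -self.player)

    def is_terminal(self):
        return self.stones == 0

    def winner(self):
        # The player who just moved took the last stone and therefore wins.
        return -self.player


class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = state.legal_moves()
        self.visits = 0
        self.wins = 0.0  # wins counted for the player who moved into this node

    def ucb1_child(self, c=math.sqrt(2)):
        # Selection rule: exploitation term plus exploration term (UCB1 / UCT).
        return max(self.children,
                   key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))


def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCB1.
        while not node.untried and node.children:
            node = node.ucb1_child()
        # 2. Expansion: add one child node for a randomly chosen untried move.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            node.children.append(Node(node.state.apply(move), node, move))
            node = node.children[-1]
        # 3. Simulation: play a uniformly random rollout to the end of the game.
        state = node.state
        while not state.is_terminal():
            state = state.apply(random.choice(state.legal_moves()))
        winner = state.winner()
        # 4. Backpropagation: update visit/win counts along the path to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.state.player:
                node.wins += 1
            node = node.parent
    # Recommend the move of the most visited root child.
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    # Optimal play from 15 stones is to take 3, leaving a multiple of 4.
    print(mcts(NimState(15)))
```

Returning the most-visited root child, rather than the child with the highest average reward, is a common convention because visit counts tend to be more robust to simulation noise.
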
79 questions
2 votes · 1 answer

How to understand the 4 steps of Monte Carlo Tree Search

From many blogs and this one https://web.archive.org/web/20160308070346/http://mcts.ai/about/index.html we know that the MCTS algorithm has 4 steps. Selection: Starting at root node R, recursively select optimal child nodes until a leaf…
2 votes · 1 answer

Java Heap Space Issue with my MCTS Gomoku player

When I run my program I get this error: Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space at MCTSNode.setPossibleMoves(MCTSNode.java:66) at MCTSNode.Expand(MCTSNode.java:167) at…
2 votes · 1 answer

MCTS *tree* parallelization in Python - possible?

I would like to parallelize my MCTS program. There are several ways to do this: Leaf parallelization, where each leaf is expanded and simulated in parallel. Root parallelization, where each thread/process creates a separate tree and when some…
2 votes · 1 answer

Monte Carlo Tree Search - "most promising" move function

I tried to implement a tic-tac-toe hello-world MCTS game player, but I encountered a problem. While simulating the game and choosing the "most promising" (exploit/explore) node, I only take the total number of wins into account (the "exploit" part) - this causes…
2 votes · 1 answer

Monte Carlo Tree Search Improvements

I'm trying to implement the MCTS algorithm on a game. I can only use around 0.33 seconds per move. In this time I can generate one or two games per child from the start state, which contains around 500 child nodes. My simulations aren't random, but…
2 votes · 1 answer

Number of simulation per node in Monte Carlo tree search

In the MCTS algorithm described on Wikipedia, exactly one playout (simulation) is performed per node selection. Now, I am experimenting with this algorithm in a simple connect-k game. I wonder, in practice, do we perform more playouts to reduce the…
Davis · 110
1 vote · 0 answers

Chess Bot not playing to expected level - Monte Carlo Tree Search

I am creating a chess bot for Sebastian Lague's "Tiny Chess Bots" competition. It uses Monte Carlo Tree Search with Upper Confidence Bounds, and the issue is that it plays very poorly - so much so that I question whether the code is…
1 vote · 1 answer

Monte-carlo search tree with hidden information

I'm currently working on a project where I apply a Monte-Carlo tree search type algorithm to a card game. In the game I chose, there is no drawing of cards: all players are dealt 8 cards and play 1 card per round for 8 rounds. My problem is the…
Qise · 192
1 vote · 1 answer

Monte carlo tree search keeps getting stuck in an infinite loop when playing (as opposed to training)

I have tried to make my own implementation of the Monte Carlo Tree Search algorithm for a simple board game, and it seems to work reasonably well while learning. However, when I switch from playing to arena mode for evaluation, the MCTS gets stuck in an…
Tue · 371
1 vote · 1 answer

When should a Monte Carlo tree search be reset?

I am trying to build/understand exactly how the Monte Carlo tree search algorithm (MCTS) works in conjunction with a neural network to learn how to play games (like chess). But I'm having trouble understanding when to reset the tree. I have looked…
Tue · 371
1 vote · 1 answer

Why is there a logarithm (and a square root) in the UCB formula of Monte Carlo Tree Search?

I studied Monte Carlo Tree Search (UCT) from several sources, like this: http://www.incompleteideas.net/609%20dropbox/other%20readings%20and%20resources/MCTS-survey.pdf However, I didn't understand why there is a logarithm (and a square root) in the…
1 vote · 1 answer

MCTS : RecursionError: maximum recursion depth exceeded while calling a Python object

For this Monte-Carlo Tree Search Python code, why do I get RecursionError: maximum recursion depth exceeded while calling a Python object? Is this normal for MCTS, which needs to keep expanding? Or did I miss any other bugs which I am still…
kevin998x · 899
1 vote · 1 answer

Repeated Calls to ProcessPoolExecutor context get slower (Python)

I am developing an MCTS algorithm and I am attempting to parallelize some of the work by rolling out multiple leaves in parallel. After doing a batch of rollouts I then want to go back and add my results to the tree (undo my virtual losses) before…
1 vote · 0 answers

Reinforcement learning with hard constraints

The environment is a directed graph that consists of nodes which have their own "goodness" (marked green) and edges that have prices (marked red). In this environment there is a price (P) constraint. The goal is to accumulate the most "goodness" points…
1 vote · 0 answers

Is Monte Carlo Tree Search needed in Partially Observable Environments during Game Play?

I understand that with a fully observable environment (chess, Go, etc.) you can run MCTS with an optimal policy network for future planning purposes. This will allow you to pick actions for game play, which will result in max expected return from…