Summary
Monte Carlo Tree Search is an iterative Reinforcement learning algorithm that finds optimal solutions in a highly multidimensional search space. It is a heuristic in that it does not require any knowledge beyond the “rules of the game”.
Details
The algorithm proceeds in four steps:
-
Selection: a child node is selected with the highest upper confidence bound applied to trees ():
: Score, or number of wins from this node : Number of simulations from this node : Total number of simulations : A constant that is chosen empirically (usually set to )
-
Expansion: a new child node is added to the tree at this optimally reached point
-
Simulation (AKA rollout): the remainder of the process or game is played out
-
Backpropagation: updating score of all nodes to the top of the tree with the result
Ref