First-Visit vs Every-Visit Monte Carlo

Question

I have recently been looking into reinforcement learning. For this, I have been reading the famous book by Sutton, but there is something I do not fully understand yet.

For Monte Carlo learning, we can choose between the first-visit and every-visit algorithms, and it can be proved that both converge to the right solution asymptotically. But I guess there is a difference between them (I understand the difference by definition, but I do not understand the drawbacks of each method). Should I use first-visit in some cases and every-visit in others?

Thanks a lot, Djaz

score 0 · Answer 1 · answered Feb 08 '22 at 11:37

In my personal experience, first-visit Monte Carlo converges faster, and for control problems it reaches the optimal policy in fewer iterations.

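I'm not sure whether there is a mathematical analysis of their rates of convergence, but both converge to the true expected return as the number of visits grows. For first-visit this follows directly from the law of large numbers, since each episode contributes one independent return per state; for every-visit the returns collected within one episode are correlated, so the argument is less direct.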
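To make the difference concrete, here is a minimal sketch of both estimators for policy evaluation. It is only an illustration: the function name `mc_evaluate`, the episode format (a list of `(state, reward)` pairs, where the reward is the one received after leaving that state), and the toy episode are all my own assumptions, not something taken from the book.

```python
from collections import defaultdict


def mc_evaluate(episodes, gamma=1.0, first_visit=True):
    """Estimate V(s) by averaging sampled returns (hypothetical helper).

    episodes: list of episodes, each a list of (state, reward) pairs.
    first_visit: if True, only the first occurrence of a state in an
    episode contributes a return; otherwise every occurrence does.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        # Compute the discounted return from each time step, working backwards.
        g = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            g = reward + gamma * g
            returns[t] = g

        seen = set()
        for t, (state, _) in enumerate(episode):
            if first_visit and state in seen:
                continue  # skip repeat visits in first-visit mode
            seen.add(state)
            returns_sum[state] += returns[t]
            returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Toy episode (made up): state "A" is visited twice.
episode = [("A", 1.0), ("B", 0.0), ("A", 2.0)]
v_first = mc_evaluate([episode], first_visit=True)
v_every = mc_evaluate([episode], first_visit=False)
print(v_first)  # {'A': 3.0, 'B': 2.0}
print(v_every)  # {'A': 2.5, 'B': 2.0}
```

The two estimates only differ for states that occur more than once in an episode. Here the returns from "A" are 3 (first visit) and 2 (second visit), so first-visit uses 3 alone while every-visit averages both to 2.5.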
