Evaluate different poker strategies

Question

Not sure if SO is the right place to ask this question, but I am going to try anyway.

I am playing with neural networks and poker, and I am facing a problem: how to evaluate different players. The poker variant I am talking about is no-limit hold'em for 6 players. Is there a better way to find the exact (or at least approximately exact) win rate of players than to simulate X hands (ranging from hundreds of thousands to millions)? The problem is that simulating a million hands is quite time-consuming, since each move means calculating a neural network output. Generating all possible hand and board combinations doesn't seem like a good idea, since there are a LOT of them.

Is it possible to do it better?

Are these trials merely one-shot hands? In other words, the model doesn't take into account the playing style of the opponents, history, etc. ? Since virtually *all* of the advantage in poker programs is from inter-player strategy, I'm not yet sure what you mean by "winrate" [sic]. — Prune, Mar 16 '18 at 18:35
Sorry, should have clarified it. What I mean by that is how many chips each player wins or loses per hand on average. So if player A wins 1000 chips after playing 100 hands, his win rate is 10 chips per hand — user3048782, Mar 16 '18 at 18:39

score 0 · Answer 1 · answered Mar 16 '18 at 21:02

Summary:

You will not want to compute this metric directly.
You will not be able to simulate all possible hands with current computing power.

The main problem is the quantity of variables: not only do you have six two-card hands and five sequential up-cards, but you have to deal with five foreign betting strategies. Unless you know all the details of those strategies, you have no way of directly computing the probability-averaged outcome.
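To give a sense of scale for the cards alone, here is a minimal sketch that counts the distinct deals for six players plus a five-card board. It assumes seats are distinguishable and treats the board as an unordered set of five cards, so it ignores the flop/turn/river order and every betting decision; the function name `count_deals` is made up for illustration.

```python
from math import comb

def count_deals(players=6, deck=52, board=5):
    """Count ways to deal two hole cards to each seat, then an unordered board."""
    total = 1
    remaining = deck
    for _ in range(players):
        total *= comb(remaining, 2)  # hole cards for the next seat
        remaining -= 2
    total *= comb(remaining, board)  # community cards from what is left
    return total

deals = count_deals()
print(f"{deals:.3e} distinct deals before any betting is considered")
```

The count comes out on the order of 10^24, and that is before multiplying by the possible betting sequences of six players.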

Assuming that you also have adaptive strategies, those adaptations add even more complexity to the computations, such that a 100-hand trial must consider the sequence of hands played -- a "big bang" of combinatorial explosion.

Thus, we seem to be stuck with Monte Carlo methods (i.e. random sampling). Experiment with a few trials to see how many you need to get a reasonable evaluation for your needs. Do you really need 10^6 hands played to do that, or will 100 or 1000 hands give you a good approximation? If you're just trying to train and tune your model, I'm guessing that 20 trials of 100 hands each will be more than you need to get 99% accuracy on your rate of return (win rate).
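One way to run that experiment is a pilot batch: play a modest number of hands, measure the spread of the per-hand results, and use the standard error to judge how many hands a given precision would take. The sketch below is only an illustration. `play_hand` is a hypothetical stand-in that draws from a normal distribution with an assumed mean and standard deviation; a real version would run the neural-network players and return the chips won or lost. The 95% interval uses the usual normal approximation.

```python
import math
import random
import statistics

def play_hand(rng):
    """Hypothetical stand-in for one simulated hand: chips won or lost.
    Assumed numbers only; replace with the real simulation."""
    return rng.gauss(0.5, 20.0)

def estimate_winrate(results, z=1.96):
    """Return (mean, half_width) of an approximate 95% confidence interval."""
    mean = statistics.fmean(results)
    std_err = statistics.stdev(results) / math.sqrt(len(results))
    return mean, z * std_err

def hands_needed(std_dev, margin, z=1.96):
    """Hands required so the interval half-width is at most `margin` chips."""
    return math.ceil((z * std_dev / margin) ** 2)

rng = random.Random(0)  # fixed seed so the pilot run is repeatable
pilot = [play_hand(rng) for _ in range(2000)]
mean, half_width = estimate_winrate(pilot)
print(f"win rate ~ {mean:.2f} +/- {half_width:.2f} chips per hand")
print("hands needed for +/- 0.5:", hands_needed(statistics.stdev(pilot), 0.5))
```

Because the half-width shrinks with the square root of the number of hands, halving the error bar costs four times as many hands; the pilot's measured standard deviation tells you where your own players fall.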
