Programming a probability to allow an AI decide when to discard a card or not in 5 card poker

Question

I am writing an AI to play 5 card poker, where you are allowed to discard a card from your hand and swap it for another randomly dealt one if you wish. My AI can value every possible poker hand as shown in the answer to my previous question. In short, it assigns a unique value to each possible hand where a higher value correlates to a better/winning hand.

My task is to now write a function, int getDiscardProbability(int cardNumber) that gives my AI a number from 0-100 relating to whether or not it should discard this card (0 = defintely do not discard, 100 = definitely discard).

The approach I have thought up of was to compute every possible hand by swapping this card for every other card in the deck (assume there are still 47 left, for now), then compare each of their values with the current hand, count how many are better and so (count / 47) * 100 is my probability.

However, this solution is simply looking for any better hand, and not distinguishing between how much better one hand is. For example, if my AI had the hand 23457, it could discard the 7 for an K, producing a very slightly better hand (better high card), or it could exchange the 7 for an A or a 6, completing the Straight - a much better hand (much higher value) than a High King.

So, when my AI is calculating this probability, it would be increased by the same amount when it sees that the hand could be improved by getting the K than it would when it sees that the hand could be improved by getting an A or 6. Because of this, I somehow need to factor in the difference in value from my hand and each of the possible hands when calculating this probability. What would be a good approach to achieve this with?

sounds like you should not care about the card itself, but the complete hand, so you would need to evaluate first the posible 4 card hands (considering all the 5 different 4 cards hands) and the odds to get the card you need for each. Then evaluate the current score against that "chance" score. — LordNeo, Mar 09 '17 at 12:48
You lost me a bit at "evaluate the possible 4 card hands (considering all the 5 different 4 card hands)" - by "the possible 4 card hands" do you mean all the possible combinations of 4 cards in a deck? And where are the "5 different 4 card hands" coming from? — KOB, Mar 09 '17 at 13:02
Check out [expected value](https://en.wikipedia.org/wiki/Expected_value). You may want to discard the card that gives you the highest expected value (if it is higher than your current hand's value). — Nico Schertler, Mar 09 '17 at 13:02
@KOB you have 5 cards in hand, so you have 5 "4-cards" combinations, so first you evaluate your current hand, then all those 4-cards hands and then the odds of getting converting those 4-cards hands into a better-than-current 5 card hand. — LordNeo, Mar 09 '17 at 13:06
@NicoSchertler I have just posted a similar question regarding expected values and its probelms with what I am trying to achieve. The main issue is that the function needs to return an integer 0-100 which will act as a guideline for my AI's decision. If I'm using expected values, ould I just find the card(s) to discard that produce the highest positive expected value and tell my AI to 100% discard these cards? http://math.stackexchange.com/questions/2178086/relationship-between-an-expected-value-and-a-probability?noredirect=1#comment4482329_2178086 — KOB, Mar 09 '17 at 13:07
@LordNeo oh ok, I see. So for each of the five 4 cards hands, I would add every card left in the deck as the fifth card, count how many of these, now 5 card hands, are better than what I currently have, then choose whichever has the highest count? — KOB, Mar 09 '17 at 13:10
@KOB a bit easier, you can count the odds of getting the needed card instead of comparing to every card in the deck, so you will end with an array of "current hand / 100% chance / X score - hand 1(4 cards discarding card a) / X% chance / Y score - hand 2(4 cards discarding card b) / X% chance / Y score - hand 3(4 cards discarding card c / X% chance / Y score - ....". that should be a lot more easy to evaluate and know if it's worth the risk — LordNeo, Mar 09 '17 at 13:15
@LordNeo and then in terms of my function needing to return a 0-100 probability, once I have my chosen card to discard from my previous comment, I count how many of these possible hands are better than my current hand, divide by the total number of these possible hands (better or worse) and multiply by 100% to get my probability. How does all that sound? — KOB, Mar 09 '17 at 13:15
@KOB yup, it's sounds better and less convoluted than my previous comment xD — LordNeo, Mar 09 '17 at 13:16
Make sure your random number generator is truly random. A good real life poker player can spot an imperfect card generator and use it to their advantage given enough hands. — Guy Coder, Mar 09 '17 at 13:44
You don't describe the game you're playing other than to say that it's poker, so that makes it difficult to give an answer. Likely you want the AI to make the choice that maximizes its expectation in the game. That is going to be somewhat similar to, but not the same, as maximizing the expectation of the hand rank. To do that, you need to know the probability distribution of the other player's final hands -- which depends on their strategies and the cards that you hold. I have a blog post on developing an AI for a poker-like game here: http://paulhankin.github.io/ChinesePoker/ — Paul Hankin, Mar 09 '17 at 14:22
It's very unlikely that your AI needs a non-deterministic strategy to play near-perfectly, so the business about scores from 0 to 100 is a distraction. Just pick the card which has the highest expectation. — Paul Hankin, Mar 09 '17 at 14:32

Paul Hankin · Answer 1 · 2017-03-09T16:09:19.537

Games in general have a chicken-egg problem: you want to design an AI that can beat a good player, but you need a good AI to train your AI against. I'll assume you're making an AI for a 2-player version of poker that has antes but no betting.

First, I'd note that if I had a table of probabilities for win-rate for each possible poker hand (of which there are surprisingly few really different ones), one can write a function that tells you the expected value from discarding a set of cards from your hand: simply enumerate all possible replacement cards and average the probability of winning with the hands. There's not that many cards to evaluate -- even if you don't ignore suits, and you're replacing the maximum 3 cards, you have only 47 * 46 * 43 / 6 = 16215 possibilities. In practice, there's many fewer interesting possibilities -- for example, if the cards you don't discard aren't all of the same suit, you can ignore suits completely, and if they are of the same suit, you only need to distinguish "same suit" replacements with "different suits" replacement. This is slightly trickier than I describe it, since you've got to be careful to count possibilities right.

Then your AI can work by enumerating all the possible sets of cards to discard of which there are (5 choose 0) + (5 choose 1) + (5 choose 2) + (5 choose 3) = 1 + 5 + 10 + 10 = 26, and pick the one with the highest expectation, as computed above.

The chicken-egg problem is that you don't have a table of win-rate probabilities per hand. I describe an approach for a different poker-related game here, but the idea is the same: http://paulhankin.github.io/ChinesePoker/ . This approach is not my idea, and essentially the same idea is used for example in game-theory-optimal solvers for real poker variants like piosolver.

Here's the method.

Start with a table of probabilities made up somehow. Perhaps you just start assuming the highest rank hand (AKQJTs) wins 100% of the time and the worst hand (75432) wins 0% of the time, and that probabilities are linear in between. It won't matter much.

Now, simulate tens of thousands of hands with your AI and count how often each hand rank is played. You can use this to construct a new table of win-rate probabilities. This new table of win-rate probabilities is (ignoring some minor theoretical issues) an optimal counter-strategy to your AI in that an AI that uses this table knows how likely your original AI is to end up with each hand, and plays optimally against that.

The natural idea is now to repeat the process again, and hope this yields better and better AIs. However, the process will probably oscillate and not settle down. For example, if at one stage of your training your AI tends to draw to big hands, the counter AI will tend to play very conservatively, beating your AI when it misses its draw. And against a very conservative AI, a slightly less conservative AI will do better. So you'll tend to get a sequence of less and less conservative AIs, and then a tipping point where your AI is beaten again by an ultra-conservative one.

But the fix for this is relatively simple -- just blend the old table and the new table in some way (one standard way is to, at step i, replace the table with a weighted average of 1/i of the new table and (i-1)/i of the old table). This has the effect of not over-adjusting to the most recent iteration. And ignoring some minor details that occur because of assumptions (for example, ignoring replacement effects from the original cards in your hand), this approach will give you a game-theoretically optimal AI, as described in: "An iterative method of solving a game, Julia Robinson (1950)."

score 0 · Answer 2 · answered Mar 09 '17 at 13:02

0

A simple (but not so simple) way would be to use some kind of database with the hand combination probabilities (maybe University of Alberta Computer Poker Research Group Database).

The idea is getting to know each combination how much percentage of winning has. And doing the combination and comparing that percentage of each possible hand.

For instance, you have 5 cards, AAAKJ, and it's time to discard (or not).

AAAKJ has a winning percentage (which I ignore, lets say 75)
AAAK (discarting J) has a 78 percentage (let's say).
AAAJ (discarting K) has x.
AAA (discarting KJ) has y.
AA (discarting AKJ) has z.
KJ (discarting AAA) has 11 (?)..
etc..

And the AI would keep the one from the combination which had a higher rate of success.

answered Mar 09 '17 at 13:02

javier_domenech

5,995
6
37
59

1

he can only discard 1 card, so he need to evaluate the 5 cards hand, then each 4-card combination and then the possible outcomes. – LordNeo Mar 09 '17 at 13:10
yep if that's the case, only compare the 4-card-hand combinations. – javier_domenech Mar 09 '17 at 13:12
1

I can actually discard 0-3 cards, I just used 1 to keep my explanation simple, and then I could alter and apply the solution for discarding 1 card for discarding 2 and 3 cards – KOB Mar 09 '17 at 14:09

score 0 · Answer 3 · answered Mar 09 '17 at 13:30

Instead of counting how many are better you might compute a sum of probabilities Pi that the new hand (with swapped card) will win, i = 1, ..., 47.

This might be a tough call because of other players as you don't know their cards, and thus, their current chances to win. To make it easier, maybe an approximation of some sort can be applied.

For example, Pi = N_lose / N where N_lose is the amount of hands that would lose to the new hand with ith card, and N is the total possible amount of hands without the 5 that the AI is holding. Finally, you use the sum of Pi instead of count.

Wow, vivoconunxino ninja'd me :). – S. Kalabukha Mar 09 '17 at 13:32 — S. Kalabukha, Mar 09 '17 at 13:32

Programming a probability to allow an AI decide when to discard a card or not in 5 card poker

3 Answers3