Algorithm to Make a Set of Random Outcomes Approach a Specific Percentage

Question

I have a pool of basketball players, each with a projected point total. I also have a function that draws a random value from a normal distribution for each player. My current algorithm generates n unique random lineups of 8 players subject to some constraints. Between lineups, the normal distribution function runs again to produce new predictions for each player, and the best lineup for that specific set of predictions is produced.

I would like to tweak this algorithm in the following way. There would be 4 tiers, each with a minimum and maximum percentage, and each player is assigned to a tier. Across the lineups generated, each player should appear with a frequency inside their tier's range. For example, if I generate 10 lineups and player 1 is in tier 1, which requires 50-60%, then player 1 should ideally appear in 5-6 lineups.

I'm struggling with how to modify my current algorithm to include this stipulation. Any thoughts would be greatly appreciated! I just don't know how to force each player within a specific range of percentages.

Sure. A lineup is simply a list of 8 players as strings based on their names. So for example, the lineup might be Stephen Curry, Russell Westbrook, James Harden, Trey Lyles, Kyle O'Quinn, LeBron James, Jimmy Butler, and Isaiah Thomas. However, there are other constraints that apply to each lineup such as what positions each player plays, etc but I have already taken those into account. — bballboy8, Feb 13 '18 at 18:04

score 1 · Answer 1 · answered Feb 13 '18 at 18:29

There are a lot of ways to do it.

Here is an easy approach. Keep a current relative odds of being picked for each player. The actual probability of being picked is that player's relative odds divided by the sum of all players' odds. Each player starts with odds equal to the expected number of times they should be selected. Whenever a player is selected, their relative odds are reduced by 1. Once the odds drop to 0 or below, that player is out of the pool.
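A minimal Python sketch of the relative-odds idea described above. The function name `pick_lineup` and the `odds` dictionary are illustrative assumptions, and the sketch ignores the position constraints and projected points mentioned in the question.

```python
import random

def pick_lineup(odds, size, rng=random):
    """Pick `size` distinct players, weighted by their remaining relative odds.

    `odds` maps player name -> remaining relative odds and is updated in place.
    (Hypothetical helper; positions and projections are not modeled.)
    """
    # Players whose odds have dropped to 0 or below are out of the pool.
    pool = {p: w for p, w in odds.items() if w > 0}
    if len(pool) < size:
        raise ValueError("not enough players left in the pool")
    lineup = []
    for _ in range(size):
        names = list(pool)
        weights = [pool[n] for n in names]
        # Probability of each pick is its odds divided by the sum of the odds.
        choice = rng.choices(names, weights=weights)[0]
        lineup.append(choice)
        del pool[choice]  # a player can appear only once per lineup
    for p in lineup:
        odds[p] -= 1  # being selected reduces the relative odds by 1
    return lineup
```

Calling `pick_lineup(odds, 8)` once per lineup, with each player's starting odds set to the expected number of appearances, caps every player at that number rounded up. It does not enforce a minimum.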

This approach guarantees that each player will not be in more than a maximum number of teams. It makes it unlikely, but not impossible, that any given player will be in fewer teams than you want.

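An easy way to solve that is to randomly round each player's desired frequency up or down to an integer count, so that the counts add up to the total number of spots to fill. And now everything has to come out even: since no player can exceed their count, each player must hit it exactly.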
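One possible way to do the random rounding, sketched in Python. The name `randomly_round` and its arguments are assumptions for illustration. It takes the floor of every desired count, then rounds up just enough randomly chosen players, weighted by their fractional parts, to reach the required total. The weighting only approximates "round up with probability equal to the fraction".

```python
import math
import random

def randomly_round(desired, total, rng=random):
    """Round each desired count up or down so the integers sum to `total`.

    `desired` maps player name -> fractional target count (hypothetical input).
    """
    counts = {p: math.floor(x) for p, x in desired.items()}
    shortfall = total - sum(counts.values())
    # Only players with a nonzero fractional part can be rounded up.
    fractional = {p: x - counts[p] for p, x in desired.items() if x > counts[p]}
    if not 0 <= shortfall <= len(fractional):
        raise ValueError("desired counts cannot be rounded to reach the total")
    for _ in range(shortfall):
        names = list(fractional)
        weights = [fractional[n] for n in names]
        p = rng.choices(names, weights=weights)[0]
        counts[p] += 1
        del fractional[p]  # each player is rounded up at most once
    return counts
```

Here `total` would be the number of lineups times the lineup size. If the tier ranges make that total unreachable, the function raises instead of silently breaking a range.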

There is yet another problem, though: it is possible that the assignment will fail to fill all the teams. But if you go from the most popular player to the least, the odds of such failures should be acceptably low. Doubly so if you widen the ranges slightly by populating a few extra teams, then throwing away the ones that didn't work out.
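A rough Python sketch of the "most popular player first" assignment. The function `assign` and the retry limit are assumptions; instead of populating extra teams and discarding failures, this version simply restarts the whole assignment when a player cannot be placed. It assumes the integer counts already sum to the number of spots.

```python
import random

def assign(counts, n_teams, team_size, rng=random, max_tries=100):
    """Place each player on exactly counts[player] distinct teams.

    Assumes sum(counts.values()) == n_teams * team_size (hypothetical setup).
    """
    if sum(counts.values()) != n_teams * team_size:
        raise ValueError("counts must add up to the number of spots")
    for _ in range(max_tries):
        teams = [[] for _ in range(n_teams)]
        failed = False
        # Go from the most popular player to the least popular one.
        for player, k in sorted(counts.items(), key=lambda kv: -kv[1]):
            open_teams = [t for t in teams if len(t) < team_size]
            if k > len(open_teams):
                failed = True  # not enough teams with a free spot; start over
                break
            # Sampling without replacement keeps the player unique per team.
            for team in rng.sample(open_teams, k):
                team.append(player)
        if not failed:
            return teams
    raise RuntimeError("no valid assignment found within max_tries")
```

Because the teams are chosen at random for each player, a given player's appearances are scattered across the lineups rather than all landing in the earliest ones.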

btilly


When trying to implement it this way in the past, one issue that I was running into is that the former teams will contain a player while the latter teams will not contain that player since that player will be eliminated leading to an unsmooth and less random assignment of players. Is there a way to fix this? – bballboy8 Feb 13 '18 at 18:33
Also I think it is very possible that this is only bounding it above, not guaranteeing that a player will reach a minimum percentage. – bballboy8 Feb 13 '18 at 18:34
It is only bounding above. But if you do the rounding up/down to integers, then it bounds below and guarantees an exact assignment. Albeit with a small possibility that some teams at the end will want the same player twice. If you do it right, some players will bunch at the beginning, others at the end, and it is random. – btilly Feb 13 '18 at 18:54
Ideally, I would like to avoid this bunching. I don't want there to be a trend between player A and player B occurring in the same lineups. That was my original intent of making the normal distribution function. – bballboy8 Feb 13 '18 at 18:56
Obviously there is going to be a group of players if they all have a high percentage that will overlap frequently but the remaining should not be trending together and should be more evenly dispersed over the lineups based on the normal distribution. – bballboy8 Feb 13 '18 at 19:04
@bballboy8 If it is truly random, our eyes will think that there are patterns. This is how things SHOULD be. Our brains are very bad at recognizing random for what it is. – btilly Feb 13 '18 at 19:22
Concrete example. Flip a fair coin 100 times. On average there will be around 6 clumps of 5 in a row the same. (6 in a row counts twice.) Only a few percent of the time will there be no such clumps. Yet those clumps look too big to us to be random. – btilly Feb 13 '18 at 19:28
Alright thanks for your help. I will try to implement that approach and see if it produces the results I'm looking for. – bballboy8 Feb 13 '18 at 19:30
From what you said above, I don't understand how I would handle a case like this. If I want player A to occur between 50-60% of lineups and I wanna make 20 lineups, I see that I just need to set his expected number of times to 12. However, how do I make sure it is at least 10? – bballboy8 Feb 13 '18 at 23:34
@bballboy8 On each run you first decide the exact integer number of times that each player will appear. If no player appears more often than that, then by the pigeonhole principle none appears less often than that either. – btilly Feb 13 '18 at 23:47
So in this case that number would be 12? – bballboy8 Feb 13 '18 at 23:51
I guess I'm just not understanding since if I made 100 players 50-60% and generated only 2 lineups, how would the algorithm cope? – bballboy8 Feb 13 '18 at 23:55
@bballboy8 You have to first randomly decide how many each player gets so that it winds up an integer each, and the number of times a player gets assigned to a team matches the number of spots on teams that need to be filled. If you try to get players to be assigned more times than there are spots for them, that first step fails. – btilly Feb 14 '18 at 01:03
I'm considering the scenario where there are more players than teams to fill rather than more teams than players to fill. The latter will never happen. – bballboy8 Feb 14 '18 at 01:07
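
The coin-flip claim from the comments can be checked with a small simulation. This is only a sketch: the function names and the trial count are assumptions. Analytically, 100 flips contain 96 windows of 5 consecutive flips, each all the same with probability 2 × (1/2)^5 = 1/16, so the expected number of clumps is 96/16 = 6.

```python
import random

def count_clumps(flips, run=5):
    """Count windows of `run` consecutive equal flips (6 in a row counts twice)."""
    return sum(
        1
        for i in range(len(flips) - run + 1)
        if len(set(flips[i:i + run])) == 1
    )

def simulate(trials=2000, n=100, seed=0):
    """Return (average clump count, fraction of trials with no clumps)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n)]
        counts.append(count_clumps(flips))
    mean = sum(counts) / trials
    none = sum(1 for c in counts if c == 0) / trials
    return mean, none
```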

score 0 · Answer 2 · answered Feb 13 '18 at 18:13

First draft

So if I understand correctly, you have N players that might appear in the first position of the string. But you want them to be selected not at random, but according to some percentage.

Now the first step is to normalize those percentages:

Alice 20%
Bob 40%
Charlie 10%
Doug 60%
Eric 30%

The sum is 160%, so you generate a random number from 1 to 160; say it's 97.

97 is more than 20, so subtract 20 and ignore Alice.
77 is more than 40, so subtract 40 and ignore Bob.
37 is more than 10, so subtract 10 and ignore Charlie.
27 is less than 60: Doug it is.

You can also pre-populate a 160-element array with 20 "Alice" indexes, 60 "Doug" indexes etc., and your player is players[array[random(160)]].
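
The subtraction walk above can be sketched in Python as follows. The function name `weighted_pick` and the optional `roll` argument are illustrative assumptions, and the weights are taken to be whole-number percentages.

```python
import random

def weighted_pick(weights, roll=None, rng=random):
    """Pick a name with probability proportional to its integer weight."""
    total = sum(weights.values())
    if roll is None:
        roll = rng.randint(1, total)  # inclusive on both ends, e.g. 1..160
    for name, w in weights.items():
        if roll <= w:
            return name
        roll -= w  # skip this player and move on to the next
    raise ValueError("roll must be between 1 and the sum of the weights")

weights = {"Alice": 20, "Bob": 40, "Charlie": 10, "Doug": 60, "Eric": 30}
print(weighted_pick(weights, roll=97))  # same walk as above: Doug
```

The standard library's `random.choices(names, weights=...)` performs the same kind of weighted draw without the manual loop.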

I think you're understanding the concept. I think what you provided is something where each team is only 1 player and in this particular instance you only generated one lineup. And since the highest percentage is Doug, he would be selected. However, I would like each player to be able to have a range. For example, Doug would be 20-30%, Bob might be 40-50% etc. But in one lineup, those ranges wouldn't matter. — bballboy8, Feb 13 '18 at 18:16
