I attempted to write some code to create a bootstrap distribution and, although it runs, I'm not sure it is working correctly.

Some background: a student at the school where I teach has been systematically finding the combinations to the locks on the laptops in our computer lab to mess with our computer teacher (who is, fortunately, not me). Each lock has three dials with the digits 0-9, so I calculate there are 10^3 = 1,000 possible combinations per lock. He kept detailed lists of the combinations he had already tried on each lock, so each successive attempt samples one combination without replacement.

I am trying to simulate this to get an idea of how many attempts he made to unlock all of these computers (there are 12 computers in the lab) by finding the expected number of attempts needed to unlock one. This sounds like a hypergeometric distribution to me. The code I wrote is:
import numpy as np

def lock_hg(N):
    final_counts = []
    for i in range(N):
        count = 1
        # All 1000 possible combinations, labeled 1 through 1000
        combs = list(np.arange(1, 1001))
        # randint's upper bound is exclusive, so it must be 1001 here
        # (randint(1, 1000) could never pick combination 1000)
        guess = np.random.randint(1, 1001)
        for k in range(1000):
            # Draw a single combination as a scalar; choice(combs, 1)
            # returns a length-1 array, which makes the == comparison
            # and list.remove below fragile
            a = np.random.choice(combs)
            if a == guess:
                final_counts.append(count)
                break
            else:
                count = count + 1
                # Never try this combination again
                combs.remove(a)
    return final_counts
The histogram plt.hist(final_counts) of the result of lock_hg(1000) looks fairly uniform, with 40 or 50 attempts being about as common as 900 or 950. I thought it would look more like a normal distribution centered at 500. I'm not sure whether there is a problem with the code or I am just misunderstanding the math. Is this code appropriate for the problem? If not, how can I fix it? If it is working, is there a more efficient way to do this and, if so, what is it?
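In case it helps, here is a vectorized sketch of what I believe is the same experiment (the function name lock_sim_fast and the rank-of-a-random-key trick are my own invention, so I may have the equivalence wrong):

```python
import numpy as np

def lock_sim_fast(n_trials, n_combos=1000, seed=None):
    # Give every combination an i.i.d. uniform random "key"; guessing the
    # combinations in order of increasing key is a uniformly random guessing
    # order, so the attempt on which the true combination is hit equals the
    # rank of its key within its row.
    rng = np.random.default_rng(seed)
    keys = rng.random((n_trials, n_combos))
    # Treat column 0 as the true combination; its attempt number is one plus
    # the number of keys in that row that are strictly smaller than its own.
    return (keys < keys[:, [0]]).sum(axis=1) + 1
```

My reasoning is that this should match the loop version while avoiding the repeated list.remove calls, but I'd appreciate confirmation that the two are actually sampling the same distribution.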