
I am trying to find, in R, the minimum number of people needed for a sampled birthday to occur with a probability of 0.9 (90%).

I am trying to do this through sampling and two for loops:

I think the expected result is around 300 people.

bus = 2
#start person count at 2
count = 0
# create counter to count birthdays equal to jan first
# assume no february 29th leap year
for (i in 1:400){
sims = 1000
for (i in 1:sims) {
  bday = sample(1:365, bus, replace = TRUE) #randomly sample birthdays, producing bus number of samples
  if (bday == 1) { # to see if there are two identical birthdays add to counter
    count= count + 1
  }
}
p= count/sims
print(p)
    if(p<0.9){bus = bus+1}
    if (p >= 0.9){bus = bus}
}
print(bus)
r2evans
pip
    Hi there, it is not very clear what you are trying to do. Could you clarify the question? If this is a probability question, it likely doesn't need `for` loops. At any rate, if you are to use `for`, there are numerous issues with this code in general (i.e., both `for` loops use `i`, you don't actually designate the position in the loop with `i` (e.g., `bday[i]`), and you don't initialize the variables before the `for` loop). – jpsmith Dec 09 '21 at 14:26

1 Answer


If you are looking for a simulation, this one would fit (answering the question theoretically would be far easier, since it follows a binomial law: you should find around 839 people on the bus):

  NDAYS <- 365
  sims <- 10000
  people_number <- 830:850
  # pick the target birthday once, after NDAYS has been defined
  birthday_searched <- sample(1:NDAYS, 1)
  probabilities <- numeric(length(people_number))
  for (k in seq_along(people_number)) {
    n <- people_number[k]
    hits <- logical(sims) # one TRUE/FALSE per simulated bus
    for (j in 1:sims) {
      bday <- sample(1:NDAYS, n, replace = TRUE) # randomly sample n birthdays
      hits[j] <- birthday_searched %in% bday
    }
    probabilities[k] <- mean(hits) # share of buses containing the target birthday
  }
  names(probabilities) <- people_number
  plot(people_number, probabilities, type = "l")
  abline(h = 0.9, col = "red")
  first_ok <- names(which(probabilities >= 0.9))[1]
  sprintf("Number of people in the bus for p=0.9 : %s", first_ok)

Explanation: the bus is full of people with random birthdays. Inside the bus, there is a certain number of unique birthdays. Taking a random date in a standard (non-leap) year, the probability that this date is the birthday of someone on the bus is (number of unique birthdays) / 365.
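
As a cross-check on the simulation, the theoretical figure quoted at the start of this answer can be computed directly. This is only a minimal sketch, assuming birthdays are independent and uniform over 365 days; the names `p_hit` and `n_min` are illustrative and not part of the code above.

```r
NDAYS <- 365
target <- 0.9

# P(at least one of n people has the chosen birthday) = 1 - (1 - 1/NDAYS)^n
p_hit <- function(n) 1 - (1 - 1 / NDAYS)^n

# Solve 1 - (1 - 1/NDAYS)^n >= target for n and round up to a whole person
n_min <- ceiling(log(1 - target) / log(1 - 1 / NDAYS))

n_min
p_hit(c(n_min - 1, n_min)) # probability just below and at the threshold
```

Under these assumptions the inequality gives a threshold just above 839, so the smallest whole number of people is 840, which agrees with the "around 839" figure.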

If you want a more precise estimate, you can simply increase `sims`.
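
To get a feel for how many simulations are needed, it may help to look at the standard error of the simulated proportion. The sketch below swaps the `sample()` loop for `rbinom()`, on the assumption that the number of people sharing the target birthday follows a Binomial(n, 1/365) distribution; `p_hat` and `se` are illustrative names.

```r
set.seed(1) # fixed seed only so the sketch is reproducible
NDAYS <- 365
n <- 840
sims <- 10000

# For each simulated bus, draw how many of the n people have the target
# birthday; the bus counts as a hit if that number is at least one.
hits <- rbinom(sims, size = n, prob = 1 / NDAYS) > 0

p_hat <- mean(hits)                    # simulated probability
se <- sqrt(p_hat * (1 - p_hat) / sims) # standard error of that proportion

c(estimate = p_hat, std_error = se)
```

With 10,000 simulations the standard error is roughly 0.003, while near n = 840 the probability changes by only about 0.0003 per additional person. Pinning down the exact crossing point by simulation would therefore probably need on the order of a million simulations per value of n.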

EDIT : more comprehensive process

Levon Ipdjian