I'm trying to calculate a confidence interval for a list of 1000 numbers and return it as a tuple of two values. However, instead of a tuple of two numbers, I get a tuple of two arrays containing 1000 values each. This is my code:

import random
import numpy as np
from scipy import stats

def bootstrap(data):
    # Build 1000 resampled lists, each the same length as the input (16 numbers here), picked at random
    randomize = [[random.choice(data) for _ in data] for _ in range(1000)]
    # Take the mean of each resampled list with NumPy to get one list of 1000 means
    means = [np.mean(sublist) for sublist in randomize]
    # Then I tried to create two variables, each a single number, for the interval bounds
    ci_left, ci_right = tuple(stats.t.interval(0.95, df=len(means) - 1, loc=means))
    return (ci_left, ci_right)

But my output looks something like this:

(array([-1.33077651, -1.30684806, -1.35418851, -1.32454884, -1.31485041,
       -1.28670879, -1.32344893, -1.38127905, -1.35198733, -1.33957749]),
 array([2.59390641, 2.61783486, 2.57049441, 2.60013409, 2.60983251,
       2.63797414, 2.60123399, 2.54340387, 2.57269559, 2.58510543,
       2.58198925, 2.56551404, 2.57899741, 2.59180679, 2.56566707]))

Example of the output I want to get:

(0.607898431, 0.611159753)

Any kind of help is appreciated!

ran bar

1 Answer

The problem was that I passed the whole means list as loc instead of the mean of the means (their sum divided by their length). I also needed to pass a scale. This is the answer:

import random
import numpy as np
from scipy import stats

def bootstrap(data):
    # Build 1000 resampled lists, each the same length as the input (16 numbers here), picked at random
    randomize = [[random.choice(data) for _ in data] for _ in range(1000)]
    # Take the mean of each resampled list with NumPy to get one list of 1000 means
    means = [np.mean(sublist) for sublist in randomize]
    # loc is now a single number (the mean of the means) and scale is their standard error
    ci_left, ci_right = tuple(stats.t.interval(0.95, df=len(means) - 1, loc=sum(means) / len(means), scale=stats.sem(means)))
    return (ci_left, ci_right)
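
To see why the original call returned arrays, here is a minimal sketch. stats.t.interval broadcasts over loc, so a list of 1000 values produces 1000 intervals, and scale defaults to 1 when it is not passed. The means array below is made-up stand-in data (a fixed-seed normal sample), not the data from the question, and the variable names are only for illustration.

```python
import numpy as np
from scipy import stats

# Stand-in for the 1000 bootstrap means (assumed data, fixed seed for repeatability)
rng = np.random.default_rng(0)
means = rng.normal(loc=0.6, scale=0.05, size=1000)

# loc is an array: one interval per element, so both bounds come back as arrays
lo_arr, hi_arr = stats.t.interval(0.95, df=len(means) - 1, loc=means)
print(np.shape(lo_arr), np.shape(hi_arr))

# loc and scale are single numbers: one interval, two plain values
lo, hi = stats.t.interval(
    0.95, df=len(means) - 1, loc=np.mean(means), scale=stats.sem(means)
)
ci = (float(lo), float(hi))
print(ci)
```

The first call gives two arrays of length 1000, the second gives a tuple of two floats, which is the shape of output asked for in the question.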
ran bar