Calculate a prediction interval for a dataset in Python

Question

I have the following table:

perc
0   59.98797
1   61.89383
2   61.08403
3   61.00661
4   62.64753
5   62.18118
6   60.74520
7   57.83964
8   62.09705
9   57.07985
10  58.62777
11  60.02589
12  58.74948
13  59.14136
14  58.37719
15  58.27401
16  59.67806
17  58.62855
18  58.45272
19  57.62186
20  58.64749
21  58.88152
22  54.80138
23  59.57697
24  60.26713
25  60.96022
26  55.59813
27  60.32104
28  57.95403
29  58.90658
30  53.72838
31  57.03986
32  58.14056
33  53.62257
34  57.08174
35  57.26881
36  48.80800
37  56.90632
38  59.08444
39  57.36432

consisting of various percentages.

I'd like to build a probability distribution from these percentages so that I can compute a prediction interval (say 95%), i.e. the range within which we would expect a new observation of this percentage to fall.

Initially I did the following, but when I tested it on my sample data I remembered that a confidence interval (CI) covers the mean, not a new observation.

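import numpy as np
import scipy.stats as st

# Get the data as a list (percDone is assumed to be the DataFrame holding the table above)
lst = list(percDone['perc'])

# 95% confidence interval for the mean (not a prediction interval).
# The level is passed positionally because the `alpha` keyword was
# renamed to `confidence` in newer SciPy releases.
st.t.interval(0.95, df=len(lst) - 1,
              loc=np.mean(lst),
              scale=st.sem(lst))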
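
For comparison, the usual textbook t-based prediction interval for a single new observation swaps the standard error s/sqrt(n) for s*sqrt(1 + 1/n), which makes it wider than the CI above. Below is a minimal sketch of that, assuming the percentages are independent and roughly normally distributed (nothing in the table confirms this). `prediction_interval` is a hypothetical helper name, not a SciPy function, and the values are typed in from the table so the snippet runs without `percDone`.

```python
import numpy as np
import scipy.stats as st

# The 40 percentages from the table above, typed in by hand
perc = np.array([
    59.98797, 61.89383, 61.08403, 61.00661, 62.64753, 62.18118, 60.74520,
    57.83964, 62.09705, 57.07985, 58.62777, 60.02589, 58.74948, 59.14136,
    58.37719, 58.27401, 59.67806, 58.62855, 58.45272, 57.62186, 58.64749,
    58.88152, 54.80138, 59.57697, 60.26713, 60.96022, 55.59813, 60.32104,
    57.95403, 58.90658, 53.72838, 57.03986, 58.14056, 53.62257, 57.08174,
    57.26881, 48.80800, 56.90632, 59.08444, 57.36432,
])


def prediction_interval(values, confidence=0.95):
    """Two-sided t-based prediction interval for one new observation.

    Assumes the values are independent and roughly normally distributed.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    mean = values.mean()
    s = values.std(ddof=1)  # sample standard deviation
    # Two-sided critical value of Student's t with n - 1 degrees of freedom
    t_crit = st.t.ppf((1 + confidence) / 2, df=n - 1)
    # sqrt(1 + 1/n) accounts for the spread of a new observation
    # plus the uncertainty in the estimated mean
    margin = t_crit * s * np.sqrt(1 + 1 / n)
    return mean - margin, mean + margin


lower, upper = prediction_interval(perc)
print(f"95% prediction interval: ({lower:.2f}, {upper:.2f})")
```

If the normality assumption looks doubtful (the 48.808 value stands out from the rest), this interval may not have the coverage it claims.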

Thanks!
