I am new to matplotlib and statistics. I am trying to learn through the example below and need help understanding the problem and finding a solution.
I have added a bar chart image below. I have sample data for four years: 1992, 1993, 1994, and 1995. I have plotted four bars showing each year's mean and margin of error. The user can also draw a rectangle to select a range on the y-axis; this is the grey horizontal rectangle in the image, with ymax=46132 and ymin=37527. The task is to compare each bar with this y-axis range, evaluate the probability that each distribution's value falls within the selected range, and colour the bar accordingly using the colour map at the bottom.
I have used the following code to compute this probability, but it does not give the results I expect. df2 has 4 rows containing the mean, standard deviation, and margin of error (MoE) for each bar. ymax=46132 and ymin=37527.
import pandas as pd
import matplotlib.cm as cm
import scipy.stats as st

cmap = cm.get_cmap('Reds')

# One row per year: sample mean, sample standard deviation, margin of error
df2 = pd.DataFrame(data=[[33312.107476, 200630.901553, 6508.897970],
                         [41861.859541, 98398.356203, 3192.254314],
                         [39493.304941, 140369.925240, 4553.902287],
                         [47743.550969, 69781.185469, 2263.851744]],
                   columns=['mean', 'std', 'MoE'],
                   index=['1992', '1993', '1994', '1995'])

# Range selected on the y-axis
ymax = 46132
ymin = 37527

for i in range(len(df2)):
    # P(ymin < X < ymax) for X ~ Normal(mean, std)
    cdf_value = (st.norm(df2.iloc[i]['mean'], df2.iloc[i]['std']).cdf(ymax) -
                 st.norm(df2.iloc[i]['mean'], df2.iloc[i]['std']).cdf(ymin))
    print(cdf_value)
    clr_shade = cmap(cdf_value)
Below are the resulting values. All are close to 0, so cmap returns a light colour for every bar. As I understand it, with the current y-axis range in the image, the bar for 1993 should be dark (higher probability), the bars for 1992 and 1995 should be light (lower probability), and the bar for 1994 should be somewhere in between.
0.017093796658858795
0.03487664518128952
0.024448867322311274
0.048988004652986805
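For comparison, here is a minimal sketch of the same calculation with the spread of each distribution taken from the MoE column instead of the std column. It rests on an assumption that is not stated above: that MoE is a 95% margin of error for the mean, so the standard error is MoE / 1.96. The names Z_95, se and prob_in_range are illustrative only.

```python
import pandas as pd
import scipy.stats as st

# Means and margins of error copied from df2 above.
df2 = pd.DataFrame(
    {"mean": [33312.107476, 41861.859541, 39493.304941, 47743.550969],
     "MoE": [6508.897970, 3192.254314, 4553.902287, 2263.851744]},
    index=["1992", "1993", "1994", "1995"],
)
ymax = 46132
ymin = 37527

# ASSUMPTION: MoE is a 95% margin of error, i.e. MoE = z * SE with z ~ 1.96.
Z_95 = st.norm.ppf(0.975)
se = df2["MoE"] / Z_95

# P(ymin < mean < ymax) under Normal(mean, SE), computed for all rows at once.
prob_in_range = pd.Series(
    st.norm.cdf(ymax, loc=df2["mean"], scale=se)
    - st.norm.cdf(ymin, loc=df2["mean"], scale=se),
    index=df2.index,
)
print(prob_in_range)
```

Under that assumption the value for 1993 comes out close to 1, the value for 1994 falls in between, and the values for 1992 and 1995 stay near the low end, which matches the ordering described above. The values lie in [0, 1], so they can be passed to cmap in the same way as cdf_value.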
Please help me understand what I am doing wrong and how to solve this.