I am getting NaN for the p-value when trying to test the null hypothesis that the mean revenue for the surf phone plan is the same as that of the ultimate plan. I don't understand what I am doing wrong. I'm assuming that it may have to do with my DataFrame call_plan_merge. There are some NaN values in the monthly_revenue column (not visible in what I posted here). Could that be the reason? At the same time, the calculated means (which, as the output shows, were computed properly while ignoring the NaNs in the monthly_revenue column) are already stored in the variables used for testing the hypothesis, so I don't understand why NaN would be generated for the p-value.
Here is my code:
#The average revenue from users of Ultimate and Surf calling plans differs.
average_rev_surf = call_plan_merge.query('tariff == "surf"')
average_rev_surf = average_rev_surf['monthly_revenue'].mean()
average_rev_ultimate = call_plan_merge.query('tariff == "ultimate"')
average_rev_ultimate = average_rev_ultimate['monthly_revenue'].mean()
alpha = 0.05 # critical statistical significance
results = st.ttest_1samp(average_rev_surf, average_rev_ultimate)
print('p-value:', results.pvalue)
if results.pvalue < alpha:
    print('We reject the null hypothesis')
else:
    print("We can't reject the null hypothesis")
print('Average revenue for the surf plan is: {:.2f}$'.format(average_rev_surf))
print('Average revenue for the ultimate plan is: {:.2f}$'.format(average_rev_ultimate))
Output:
p-value: nan
We can't reject the null hypothesis
Average revenue for the surf plan is: 35.77$
Average revenue for the ultimate plan is: 36.32$
This is what call_plan_merge looks like:
user_id call_month total_calls duration tariff reg_month churn_month state monthly_revenue
0 1000.0 12.0 16.0 124.0 ultimate 12 13.0 GA 70.00
1 1001.0 8.0 27.0 182.0 surf 8 13.0 WA 20.00
2 1001.0 9.0 49.0 315.0 surf 8 13.0 WA 20.00
3 1001.0 10.0 65.0 393.0 surf 8 13.0 WA 90.09
4 1001.0 11.0 64.0 426.0 surf 8 13.0 WA 60.00
5 1001.0 12.0 56.0 412.0 surf 8 13.0 WA 60.00
6 1002.0 10.0 11.0 59.0 surf 10 13.0 NV 20.00
7 1002.0 11.0 55.0 386.0 surf 10 13.0 NV 60.00
8 1002.0 12.0 47.0 384.0 surf 10 13.0 NV 20.00
9 1003.0 12.0 149.0 1104.0 surf 1 13.0 OK 158.12
Thank you so much for your help!
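
A likely explanation, offered as a sketch rather than a definitive diagnosis: st.ttest_1samp is being given a single number (the already-computed surf mean) as its sample. A sample of size one has no defined standard deviation, so the t statistic and p-value come out as NaN regardless of the NaNs in monthly_revenue. To compare the two plans, one option is to pass the two revenue columns themselves to st.ttest_ind after dropping NaNs (its default nan_policy is 'propagate', so NaNs left in the data would also produce NaN). The small DataFrame below is a hypothetical stand-in for call_plan_merge with made-up values, and equal_var=False (Welch's test) is an assumption, not something stated in the question.

```python
import numpy as np
import pandas as pd
from scipy import stats as st

# Hypothetical stand-in for call_plan_merge; the values are illustrative only.
call_plan_merge = pd.DataFrame({
    "tariff": ["surf"] * 6 + ["ultimate"] * 5,
    "monthly_revenue": [20.0, 20.0, 90.09, 60.0, np.nan, 158.12,
                        70.0, 70.0, 84.0, np.nan, 70.0],
})

# Keep the full revenue samples (not their means) and drop missing values.
surf = call_plan_merge.query('tariff == "surf"')["monthly_revenue"].dropna()
ultimate = call_plan_merge.query('tariff == "ultimate"')["monthly_revenue"].dropna()

alpha = 0.05  # significance level

# Two-sample t-test on the raw observations.
# equal_var=False (Welch's test) is an assumption: it does not require
# the two plans to have the same variance.
results = st.ttest_ind(surf, ultimate, equal_var=False)

print('p-value:', results.pvalue)
if results.pvalue < alpha:
    print('We reject the null hypothesis')
else:
    print("We can't reject the null hypothesis")
```

The means can still be printed separately with surf.mean() and ultimate.mean(); they are just not what the test function should receive.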