How to use hypothesis testing to compare groups

Question

I have a data frame with name, type, and Turnover per game. A sample of that df is given below.

Name    Type    Turnover per game
kevin   A       5
duke    B       10
jonas   A       12
angly   A       2
anjelo  B       10
wily    A       4
nick    A       8

What I want to do is implement a hypothesis test to check whether Type A players have, on average, fewer turnovers than Type B players.

What I tried :

Firstly, group by Type:

df.groupby('Type')['Turnover per game'].mean()

But I don't know how to implement a hypothesis test to check the above condition.

After the groupby I need to implement a hypothesis test to check the above condition — , Jan 07 '21 at 07:43
No, it just returns the type A mean. What I want to do is implement the hypothesis test in Python to check the condition highlighted above — , Jan 07 '21 at 07:58
I checked and honestly have no idea, I have never worked with hypothesis tests — jezrael, Jan 07 '21 at 08:06

Stef · Accepted Answer · 2021-01-07T09:41:02.900

1

Hypothesis testing can be done with scipy.stats.ttest_ind:

import pandas as pd
from scipy import stats

data = {'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
        'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
        'Turnover': [5, 10, 12, 2, 10, 4, 8]}
df = pd.DataFrame(data)

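# One-sided Welch t-test (equal_var=False): is mean(A) less than mean(B)?
# The alternative parameter requires SciPy 1.6.0 or later.
t, p = stats.ttest_ind(df.Turnover[df.Type.eq('A')], df.Turnover[df.Type.eq('B')],
                       equal_var=False, alternative='less')

if p < 0.05:
    print('Type A players have fewer turnovers on average than Type B players')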
else:
    print('Null hypothesis (equal means) cannot be rejected.')

In your example, the null hypothesis that type A and B players have equal mean turnovers is rejected at the 5% level, in favor of the alternative hypothesis that type A players have fewer turnovers on average than type B players. See the Interpretation section of the linked Wikipedia article for details.
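To see where that result comes from, here is a minimal sketch that works out the Welch statistic by hand for the sample data in the question. The variable names are mine, and it assumes the seven sample rows are the whole data set; it should agree with what ttest_ind reports for equal_var=False and alternative='less'.

```python
import numpy as np
from scipy import stats

# Sample data from the question, split by Type
a = np.array([5, 12, 2, 4, 8], dtype=float)   # Type A turnovers
b = np.array([10, 10], dtype=float)           # Type B turnovers

# Variance of each group mean: sample variance divided by group size
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size

# Welch t statistic: difference of means over its standard error
t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite approximation of the degrees of freedom
dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

# One-sided p-value for H1: mean(A) < mean(B) is the lower tail
p_less = stats.t.cdf(t_stat, dof)

print(t_stat, dof, p_less)
```

With only five and two observations the p-value lands just under 0.05, so the conclusion is sensitive to the sample.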

edited Jan 07 '21 at 09:41

answered Jan 07 '21 at 09:24


Thanks. How to define the null and alternative hypotheses from the above question? Can you kindly mention them? – Jan 07 '21 at 10:44

H0 = type A and B players have equal mean turnovers; H1 = type A players have fewer turnovers on average than type B players (for a one-sided test with alternative='less') – Stef Jan 07 '21 at 10:50
Hi, it is hard for me to install scipy 1.6.0. Is there any other method to do this? The alternative parameter requires version 1.6.0; lower versions don't have it. – Jan 07 '21 at 13:27
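One possible workaround for older SciPy versions, as a sketch rather than a definitive recipe: run the default two-sided test and convert its p-value to a one-sided one using the sign of the t statistic. The helper name ttest_less is hypothetical, not part of SciPy.

```python
from scipy import stats

def ttest_less(a, b):
    """One-sided Welch test of mean(a) < mean(b) without the `alternative` argument.

    Hypothetical helper: halves the two-sided p-value when t is negative,
    otherwise returns the complementary tail.
    """
    t_stat, p_two_sided = stats.ttest_ind(a, b, equal_var=False)
    if t_stat < 0:
        p_less = p_two_sided / 2
    else:
        p_less = 1 - p_two_sided / 2
    return t_stat, p_less

# Sample data from the question
type_a = [5, 12, 2, 4, 8]
type_b = [10, 10]

t_stat, p_less = ttest_less(type_a, type_b)
print(t_stat, p_less)
```

This works because the t distribution is symmetric, so the lower-tail probability is half the two-sided p-value whenever the statistic is negative.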

score 0 · Answer 2 · answered Jan 07 '21 at 09:36

The hypothesis test you have mentioned, if I understand correctly, looks straightforward.

Get the turnover mean by grouping by 'Type'

df_group_by_type = df.groupby('Type')['Turnover per game'].mean()
df_group_by_type

Type
A    6.2 
B    10.0

and then just check the required condition

df_group_by_type['A'] < df_group_by_type['B']
True
