2

I have a data frame with name, type, and Turnover per game. A sample of that df is given below.

Name    Type    Turnover per game
kevin   A       5
duke    B       10
jonas   A       12
angly   A       2
anjelo  B       10
wily    A       4
nick    A       8

What I want to do is implement a hypothesis test to check whether Type A players have, on average, fewer turnovers than Type B players.

What I tried:

Firstly, group by Type:

df.groupby('Type')['Turnover per game'].mean()

But I don't know how to implement a hypothesis test to check the above condition.

Georgy
  • What does this line mean? (s['American'], s['Europe']) –  Jan 07 '21 at 07:43
  • After the groupby, I need to implement hypothesis testing to check the above condition. –  Jan 07 '21 at 07:43
  • No, it just returns the type A mean. What I want to do is implement the hypothesis test in Python to check the condition stated above. –  Jan 07 '21 at 07:58
  • 2
    I checked and honestly have no idea; I have never worked with hypothesis tests. – jezrael Jan 07 '21 at 08:06
  • Me too, mate. This is my very first time. –  Jan 07 '21 at 08:08

2 Answers

1

Hypothesis testing can be done with ttest_ind:

import pandas as pd
from scipy import stats

data = {'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
        'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
        'Turnover': [5, 10, 12, 2, 10, 4, 8]}
df = pd.DataFrame(data)

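# Welch's t-test (equal_var=False), one-sided: H1 is mean(A) < mean(B).
# The alternative parameter requires SciPy >= 1.6.0.
t, p = stats.ttest_ind(df.Turnover[df.Type.eq('A')], df.Turnover[df.Type.eq('B')],
                       equal_var=False, alternative='less')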

if p < 0.05:
    print('Type A players have fewer turnovers on average than Type B players')
else:
    print('Null hypothesis (equal means) cannot be rejected.')

In your example, the null hypothesis that type A and type B players have equal mean turnovers is rejected at the 5% level (p ≈ 0.047) in favor of the alternative hypothesis that type A players have fewer turnovers on average than type B players. See the Interpretation section of the Wikipedia article linked above for details.
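
The alternative parameter of ttest_ind is only available from SciPy 1.6.0 onward. If upgrading is not an option, one possible workaround is to run the default two-sided test and convert its p-value by hand. The following is a minimal sketch under that assumption; the names a, b, and p_less are illustrative, and the conversion relies on the symmetry of the t distribution:

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
    'Turnover': [5, 10, 12, 2, 10, 4, 8],
})
a = df.loc[df['Type'] == 'A', 'Turnover']
b = df.loc[df['Type'] == 'B', 'Turnover']

# Two-sided Welch test; does not need the `alternative` parameter.
t_stat, p_two_sided = stats.ttest_ind(a, b, equal_var=False)

# One-sided p-value for H1: mean(A) < mean(B).
# If t is negative the data point in the direction of H1, so halve p;
# otherwise the one-sided p-value is the complement.
p_less = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2

print(t_stat, p_less)
```

With only two type B players in the sample, the result should be read with caution regardless of which SciPy version computes it.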

Stef
  • Thanks. How would you define the null and alternative hypotheses for the above question? Could you kindly state them? –  Jan 07 '21 at 10:44
  • 1
    H0: type A and type B players have equal turnovers; H1: type A players have fewer turnovers than type B players (for a one-sided test with alternative='less') – Stef Jan 07 '21 at 10:50
  • Hi, it is hard to install SciPy 1.6.0. Is there any other method to do this? The alternative parameter requires version 1.6.0; lower versions don't have it. –  Jan 07 '21 at 13:27
0

The hypothesis test you have mentioned, if I understand correctly, looks straightforward.

Get the mean turnover by grouping by 'Type':

# Mean turnover per game for each player type
df_group_by_type = df.groupby('Type')['Turnover per game'].mean()
df_group_by_type

Type
A     6.2
B    10.0

and then just check the required condition:

df_group_by_type['A'] < df_group_by_type['B']
True
ggaurav