My dataframe is composed of accounting variables and a dummy variable that allows me to identify two types of company. I would like to perform a t-test for every column of my dataframe in order to compare the means of the variables between the two types of company.

For the moment I have split my df into two dataframes based on the dummy variable and run the following code:

for column_type1, column_type2 in zip(df_type1.columns[1:],df_type2.columns[1:]):
    print(ttest_ind(column_type1,column_type2, equal_var=False, nan_policy='omit'))

However, I'm getting the following error:

TypeError: cannot perform reduce with flexible type

If you know how to solve this or have a better way to do it your help is more than welcome!

Thanks

**** EDIT & SOLUTION ****

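I've solved my issue; here is the code that works. The loop now indexes each dataframe with the column label, so ttest_ind receives the column values rather than the label strings.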

for column_type1, column_type2 in zip(df_type1,df_type2):
    print(ttest_ind(df_type1[column_type1],df_type2[column_type2], equal_var=False, nan_policy='omit'))
Pierrot75
  • is the error in the zip or in the call to test? That might be a good clue, try running each statement separately – E.Serra Aug 24 '18 at 10:38
  • The error is in the call to test. – Pierrot75 Aug 24 '18 at 10:50
  • It seems that the loop gets only the labels of the columns and not the values in the columns. – Pierrot75 Aug 24 '18 at 10:57
  • there you got your bug then, have you tried running the test with an arbitrary input? If that works, then all you have to do is generate that input format from your data. – E.Serra Aug 24 '18 at 10:58
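The diagnosis in the comments can be reproduced on a small example. The sketch below uses made-up data (the column names `dummy`, `assets` and `debt` are assumptions, not taken from the question) to show that iterating over `.columns` yields label strings, while indexing the dataframe with a label yields the numeric values that `ttest_ind` expects.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

# Hypothetical data: 'dummy' marks the company type,
# the other columns stand in for accounting variables.
df = pd.DataFrame({
    "dummy": [0, 0, 0, 1, 1, 1],
    "assets": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "debt": [2.0, np.nan, 4.0, 1.0, 3.0, 5.0],
})
df_type1 = df[df["dummy"] == 0]
df_type2 = df[df["dummy"] == 1]

# Iterating over .columns gives the labels (strings), not the data,
# which is what the original loop passed to ttest_ind.
labels = list(df_type1.columns[1:])
print(labels)

# Indexing each dataframe with the label gives a numeric Series.
result = ttest_ind(df_type1["assets"], df_type2["assets"],
                   equal_var=False, nan_policy="omit")
print(result.statistic, result.pvalue)
```

Passing the strings `"assets"` and `"assets"` instead of the two Series is what leads to the failure reported in the question.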

1 Answer

for column_type1, column_type2 in zip(df_type1,df_type2):
    print(ttest_ind(df_type1[column_type1],df_type2[column_type2], equal_var=False, nan_policy='omit'))
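As a possible alternative, the test can be run without splitting the dataframe first. This is only a sketch under assumptions: the helper `ttest_by_dummy`, the sample data, and the column names are hypothetical, and the dummy is assumed to be coded 0/1. It runs the same Welch test (`equal_var=False`) on every numeric column and collects the results in a table instead of printing them.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def ttest_by_dummy(df, dummy_col):
    """Welch t-test of every numeric column, split by a 0/1 dummy column.

    Hypothetical helper for illustration; not part of pandas or SciPy.
    """
    mask = df[dummy_col] == 1
    # Keep numeric columns only, and skip the dummy itself.
    numeric_cols = [c for c in df.select_dtypes(include="number").columns
                    if c != dummy_col]
    rows = {}
    for col in numeric_cols:
        res = ttest_ind(df.loc[~mask, col], df.loc[mask, col],
                        equal_var=False, nan_policy="omit")
        rows[col] = {"statistic": float(res.statistic),
                     "pvalue": float(res.pvalue)}
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up data: 'name' is non-numeric and is skipped automatically.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "dummy": [0, 0, 0, 1, 1, 1],
    "assets": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "debt": [2.0, np.nan, 4.0, 1.0, 3.0, 5.0],
})
table = ttest_by_dummy(df, "dummy")
print(table)
```

Selecting columns by label from a single dataframe avoids relying on the two split dataframes having their columns in the same order, which the `zip` approach assumes.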
Pierrot75