I have data where I'm modeling a binary dependent variable. There are 5 other categorical predictor variables, and I have run a chi-square test of independence for each of them against the dependent variable. All came back with very low p-values.
Now, I'd like to create a chart that displays all of the differences between the observed and expected counts. It seems like this should be part of the scipy chi2_contingency function but I can't figure it out.
The only thing I can think of is that the chi2_contingency function will output an array of expected counts, so I guess I need to figure out how to convert my cross tab table of observed counts into an array and then subtract the two.
## Gender & Income: cross-tabulation table and chi-square
ct_sex_income=pd.crosstab(adult_df.sex, adult_df.income, margins=True)
ct_sex_income
## Run Chi-Square test
scipy.stats.chi2_contingency(ct_sex_income)
## try to subtract them
ct_sex_income.observed - chi2_contingency(ct_sex_income)[4]
The error I get is "AttributeError: 'DataFrame' object has no attribute 'observed'".
I'd like just an array that shows the differences.
Thanks in advance for any help.
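For reference, here is a minimal sketch of the subtraction described above, not necessarily the only way to do it. It assumes a small made-up `adult_df` (hypothetical data standing in for the real one) and builds the crosstab without `margins=True`, since the totals row and column would otherwise be treated as extra categories by `chi2_contingency`. The expected counts are the fourth item returned (index 3), and because they have the same shape as the crosstab, the two can be subtracted directly.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical stand-in for adult_df; the real data would be loaded elsewhere.
adult_df = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Female",
            "Male", "Female", "Male", "Female"],
    "income": ["<=50K", ">50K", "<=50K", "<=50K",
               ">50K", "<=50K", "<=50K", ">50K"],
})

# Observed counts, without margins so only the real categories are tested.
observed = pd.crosstab(adult_df["sex"], adult_df["income"])

# chi2_contingency returns (statistic, p-value, dof, expected frequencies).
chi2, p, dof, expected = chi2_contingency(observed)

# DataFrame minus same-shaped ndarray subtracts element by element
# and keeps the row/column labels.
differences = observed - expected

# Plain NumPy array of the differences, if labels are not needed.
diff_array = differences.to_numpy()

print(differences)
```

Keeping `differences` as a DataFrame preserves the category labels, which may be handy when charting; `diff_array` is the bare array version.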