I am trying to run a test of statistical significance on a dataset with several categorical variables, where each combination of categories has a different population size. Here is an example of what I mean:

Hair Color    Eye Color    Shoe Size    # of Respondents    Population Size    Response Rate
Brown         Blue         12           3                   10                 0.3
Brown         Blue         13           4                   20                 0.2
Brown         Green        12           5                   5                  1.0
Brown         Green        13           8                   10                 0.8    
Black         Blue         12           2                   20                 0.1
Black         Blue         13           5                   10                 0.5
Black         Green        12           2                   10                 0.2
Black         Green        13           4                   20                 0.2

I want to estimate the effect of each categorical variable on the response rate, both individually and in combination. As I understand it, a three-way ANOVA is a good way to do this. However, from what I've read online, that test expects a dependent variable measured separately for each respondent within each combination of categories, rather than a single aggregated value such as a response rate. Is there a way to format my data (or set up the ANOVA) so that it becomes a useful test to run, or is a different kind of test more appropriate here?

If it's helpful, I'm working on this in Python. Thanks!

  • Even though you want to do this in Python, fundamentally it's a statistics question and you're probably better off asking it here: https://stats.stackexchange.com/. Having said that, you should be able to expand this into single-subject data. Since the outcome is either 0 or 1 (it seems), you're probably looking for a logistic regression. – LukasNeugebauer May 19 '22 at 14:30
  • Honestly forgot there was a stats-specific one, thanks for the advice. Also that's a good idea, just expand it into dummy rows. Thanks so much! – user19154333 May 19 '22 at 14:44
  • If anyone finds this, follow-up here: https://stats.stackexchange.com/questions/575894/n-way-anova-for-participation-rate-data – user19154333 May 19 '22 at 15:21
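
For anyone landing here, the suggestion from the comments (expand the table into one row per person, then fit a logistic regression) could be sketched roughly as below. This is only an illustration under some assumptions: it uses pandas and statsmodels, the column names (`hair`, `eye`, `shoe`, `respondents`, `population`, `responded`) are my own shorthand for the table in the question, and it fits main effects only.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Aggregated table from the question (column names are hypothetical shorthand).
cells = pd.DataFrame({
    "hair": ["Brown"] * 4 + ["Black"] * 4,
    "eye": ["Blue", "Blue", "Green", "Green"] * 2,
    "shoe": [12, 13] * 4,
    "respondents": [3, 4, 5, 8, 2, 5, 2, 4],
    "population": [10, 20, 5, 10, 20, 10, 10, 20],
})

# Expand to one row per person: responded = 1 for respondents, 0 for everyone else.
rows = []
for cell in cells.itertuples(index=False):
    non_respondents = cell.population - cell.respondents
    for outcome, count in ((1, cell.respondents), (0, non_respondents)):
        rows.extend(
            {"hair": cell.hair, "eye": cell.eye, "shoe": cell.shoe, "responded": outcome}
            for _ in range(count)
        )
people = pd.DataFrame(rows)

# Main-effects logistic regression; C(...) marks each variable as categorical.
model = smf.logit("responded ~ C(hair) + C(eye) + C(shoe)", data=people).fit(disp=False)
print(model.summary())
```

Replacing `+` with `*` in the formula adds interaction terms. One caveat: the Brown/Green/12 cell has a 100% response rate, so a fully saturated model may fail to converge or give unstable estimates for that cell.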
