I am running a logistic regression and I want to control for the respondents' country. I have 12 countries. I used the "fastDummies" package to create a dummy for each country:

    ALL <- dummy_cols(ALL, select_columns = "country")

I get something like this:

    country_Japan  1 1 0 0 0 0
    country_Taiwan 0 0 1 1 0 0
    country_China  0 0 0 0 1 1

and so on. As you can see, the dummies sum to 1 for every respondent, so together they are perfectly collinear. For this reason, I cannot estimate the model. I read that I need to include a variable with 0s as the last country dummy to avoid this collinearity. Is this correct? I included the intercept (a column of 1s), but it did not help. I would appreciate your suggestions. Thanks

Yacila
    Just turn your country variable into a factor and the regression function automatically turns it into dummies. – deschen Oct 09 '21 at 10:50
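
As a rough illustration of the factor approach in the comment above, here is a minimal sketch in base R. The toy data, the outcome column `y`, and the three country names are made up for the example and are not the asker's data.

```r
# Toy data (hypothetical): 3 countries, 20 respondents each, random binary outcome
set.seed(1)
ALL <- data.frame(
  country = rep(c("Japan", "Taiwan", "China"), each = 20),
  y       = rbinom(60, size = 1, prob = 0.5)
)

# Store country as a factor; glm() expands it into dummies internally
# and leaves out the first level (alphabetically "China") as the reference
ALL$country <- factor(ALL$country)

fit <- glm(y ~ country, data = ALL, family = binomial)

# One intercept plus one coefficient per non-reference country
names(coef(fit))
```

If a different reference country is preferred, `relevel(ALL$country, ref = "Japan")` changes the baseline before fitting.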

1 Answer

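Set the remove_first_dummy argument of the dummy_cols function to TRUE. This drops the first country dummy, so that country becomes the reference category and the remaining dummies no longer sum to the intercept column. This should solve your problem of multicollinearity.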
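
To see why dropping one dummy helps, the following sketch compares the rank of the two design matrices in base R and then, only if fastDummies happens to be installed, makes the call suggested above. The six-row data frame is a made-up stand-in for the real data.

```r
# Toy data (hypothetical), mirroring the pattern shown in the question
ALL <- data.frame(
  country = c("Japan", "Japan", "Taiwan", "Taiwan", "China", "China")
)

# Intercept plus a dummy for every country: 4 columns,
# but the three dummies add up to the intercept column
X_full <- cbind(intercept = 1, model.matrix(~ country - 1, data = ALL))
rank_full <- qr(X_full)$rank   # smaller than ncol(X_full): rank deficient

# Intercept plus dummies for all but one country: full column rank
X_ref <- model.matrix(~ country, data = ALL)
rank_ref <- qr(X_ref)$rank     # equal to ncol(X_ref)

# The same idea with fastDummies, guarded in case the package is not installed
if (requireNamespace("fastDummies", quietly = TRUE)) {
  ALL_d <- fastDummies::dummy_cols(
    ALL,
    select_columns     = "country",
    remove_first_dummy = TRUE
  )
  print(names(ALL_d))
}
```

With 12 countries the same logic applies: 11 dummies plus the intercept are estimable, while all 12 plus the intercept are not.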

deschen