I am running a logistic regression and want to control for the respondents' country. There are 12 countries. I used the "fastDummies" package to create a dummy variable for each country:
ALL <- dummy_cols(ALL, select_columns = "country")
I get something like this:
country_Japan 1 1 0 0 0 0
country_Taiwan 0 0 1 1 0 0
country_China 0 0 0 0 1 1
and so on...
As you can see, the dummies sum to 1 in every row, so taken together they are perfectly collinear. For this reason, I cannot estimate the model.
I read that I need to include a variable with 0s as the last country dummy to avoid this collinearity. Is this correct? I included the intercept (a column of 1s), but it did not help.
I would appreciate your suggestions. Thanks
Yacila
Just turn your country variable into a factor and the regression function will automatically turn it into dummies. – deschen Oct 09 '21 at 10:50
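A minimal sketch of what the comment suggests, using base R only. The toy data and the column names `y` and `age` are made up for illustration; only `ALL` and `country` come from the question.

```r
# Toy data standing in for the real survey (hypothetical values).
set.seed(1)
ALL <- data.frame(
  country = rep(c("Japan", "Taiwan", "China"), each = 40),
  age     = rnorm(120, mean = 40, sd = 10)
)
ALL$y <- rbinom(120, size = 1, prob = 0.5)

# Store country as a factor; glm() then builds k - 1 dummies internally,
# leaving the first factor level out as the reference category.
ALL$country <- factor(ALL$country)
fit <- glm(y ~ age + country, data = ALL, family = binomial)

# Levels are sorted alphabetically, so "China" is the reference here
# and only Japan and Taiwan get their own coefficients.
names(coef(fit))
```

If a different baseline is preferred, something like `ALL$country <- relevel(ALL$country, ref = "Japan")` before fitting should change the reference level.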
1 Answer
Check the remove_first_dummy parameter in the dummy_cols function, i.e. set it to TRUE. This should solve your problem of multicollinearity.
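As a rough illustration of why this helps, the sketch below compares the rank of the design matrix with and without the first dummy. The three-country data frame is a toy stand-in for the real data, and the `X_full` / `X_red` names are just for this example.

```r
library(fastDummies)

# Toy stand-in for the real data: three countries, two respondents each.
ALL <- data.frame(country = rep(c("Japan", "Taiwan", "China"), each = 2))

# Full set of dummies: one column per country, summing to 1 in every row.
full <- dummy_cols(ALL, select_columns = "country")
X_full <- cbind(intercept = 1, as.matrix(full[, -1]))  # drop the original column
ncol(X_full)      # 4 columns ...
qr(X_full)$rank   # ... but rank 3, because intercept = sum of the dummies

# Dropping one dummy removes the redundancy with the intercept.
reduced <- dummy_cols(ALL, select_columns = "country",
                      remove_first_dummy = TRUE)
X_red <- cbind(intercept = 1, as.matrix(reduced[, -1]))
qr(X_red)$rank == ncol(X_red)  # full column rank
```

The dropped country becomes the reference category, so the remaining coefficients are read as differences from it. Which country counts as "first" depends on how the column is ordered, so it is worth checking the resulting column names.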

deschen