When I run the following:

df <- data.frame(place = c("South", "South", "North"),
                 temperature = c(30, 30, 20),
                 outlookfine = c(TRUE, TRUE, FALSE))
glm.fit <- glm(outlookfine ~ ., data = df, family = binomial)

glm.fit

The output is:

Call:  glm(formula = outlookfine ~ ., family = binomial, data = df)

Coefficients:
(Intercept)   placeSouth  temperature  
     -23.57        47.13           NA  

Degrees of Freedom: 2 Total (i.e. Null);  1 Residual
Null Deviance:      3.819 
Residual Deviance: 3.496e-10    AIC: 4

Why is North missing?

[Update] I added "East" and now North appears. How does R choose which level is the base case?

I am checking the docs.

Kirsten
  • North is not missing. The model is treating it as the baseline case ("North" comes alphabetically before "South"). The coefficient is the effect of South relative to North. – jdobres May 20 '21 at 21:48
  • I believe R treats your place as a dummy variable. Everything will be in reference to the "North" term, so placeSouth is the difference in the north and south terms. It is technically fitting two lines. – Josh May 20 '21 at 21:53
  • Thank you both. I feel like this question should have been asked somewhere before, but I can't even find an explanation in the docs. – Kirsten May 20 '21 at 22:25
  • I asked another newbie question here: https://stackoverflow.com/questions/67628712/in-glm-why-are-some-coeeficients-na-even-when-the-data-is-given – Kirsten May 20 '21 at 22:26
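
To make the comments above concrete, here is a minimal sketch of how the baseline level can be inspected and changed. It assumes R's default treatment contrasts, where the first factor level gets no column of its own and is absorbed into the intercept. The names `place_factor`, `with_east` and `place_releveled` are illustrative and not part of the original question.

```r
# Same data as in the question.
df <- data.frame(place = c("South", "South", "North"),
                 temperature = c(30, 30, 20),
                 outlookfine = c(TRUE, TRUE, FALSE))

# Factor levels are sorted by default, and the first level is the baseline.
place_factor <- factor(df$place)
print(levels(place_factor))    # "North" "South"

# The design matrix has no column for the baseline level "North".
default_cols <- colnames(model.matrix(~ place, data = df))
print(default_cols)            # "(Intercept)" "placeSouth"

# Adding "East" changes the first sorted level, so "East" becomes the baseline
# and "North" would get its own coefficient.
with_east <- factor(c(as.character(df$place), "East"))
print(levels(with_east))       # "East" "North" "South"

# relevel() picks the baseline explicitly instead of relying on sort order.
df$place_releveled <- relevel(place_factor, ref = "South")
releveled_cols <- colnames(model.matrix(~ place_releveled, data = df))
print(releveled_cols)          # "(Intercept)" "place_releveledNorth"
```

Passing a releveled column to `glm()` in place of `place` should report a coefficient for North relative to South, rather than the other way round.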

0 Answers