
If `df$Y` is a factor, which level does `glm(Y ~ X, data = df, family = binomial)` treat as 1 and which as 0 by default?

It seems that the level passed as `ref` to `relevel` is not the one treated as 1?

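fun <- function(){

    # y is 'a' for x = 1..100 and 'b' for x = 101..200
    test <- data.frame(y = factor(c(rep('a',100),rep('b',100))), x = 1:200)
    t1 <- sample(c('a','b'), 1)   # pick the reference level at random

    # fit with t1 as the reference (first) level of the factor
    test[,'y'] <- relevel(test[,'y'], ref = t1)
    co <- glm(y ~ x, data = test, family = binomial)$coefficients[2]

    # fit again with t1 coded explicitly as 1
    test[,'y'] <- ifelse(test[,'y'] == t1, 1, 0)
    c1 <- glm(y ~ x, data = test, family = binomial)$coefficients[2]

    # TRUE if both slopes have the same sign
    return((co > 0) == (c1 > 0))
}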


fun()
## FALSE
donald
  • See `levels(Y)`. If Y has 2 levels, the first level is treated as 0. In fact, if Y has more than two levels, the first level is still treated as 0 and the remaining levels are aggregated into a single 1. – lmo Apr 01 '17 at 21:40
  • @lmo It seems like it's not the first element but the level which comes first alphabetically. See my edit. Is that true? – donald Apr 01 '17 at 21:56
  • The first element of the output of `levels(Y)`. If of interest, you can reset the base category with `relevel`. – lmo Apr 01 '17 at 21:57
  • @lmo It seems like the argument to `ref` is not treated as the `1` value. Is this correct? See my edit. – donald Apr 01 '17 at 22:25
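
The behaviour described in the comments can be checked directly. Below is a minimal sketch on made-up data (the variables `x`, `y`, `y2` and `y3` are hypothetical, not from the question, and the data are noisy so that there is no perfect separation). It compares a factor response against an explicit 0/1 coding in which the first element of `levels()` is 0, then repeats the fit after `relevel`, and finally tries a three-level factor.

```r
set.seed(1)

# Hypothetical toy data: "b" becomes more likely as x grows
x <- rnorm(200)
y <- factor(ifelse(x + rnorm(200) > 0, "b", "a"))
levels(y)   # the first element here is the level glm() codes as 0

# Factor response vs. explicit coding: first level -> 0, anything else -> 1
fit_factor  <- glm(y ~ x, family = binomial)
fit_numeric <- glm(as.integer(y != levels(y)[1]) ~ x, family = binomial)
same_as_numeric <- isTRUE(all.equal(coef(fit_factor), coef(fit_numeric)))

# relevel() makes "b" the first level, so "b" is now coded as 0
# and every coefficient changes sign
y2 <- relevel(y, ref = "b")
fit_flipped <- glm(y2 ~ x, family = binomial)
sign_flipped <- isTRUE(all.equal(coef(fit_flipped), -coef(fit_factor),
                                 tolerance = 1e-6))

# Three levels: "a" is 0, while "b" and "c" are pooled into 1
y3 <- factor(sample(c("a", "b", "c"), 200, replace = TRUE))
fit3     <- glm(y3 ~ x, family = binomial)
fit3_num <- glm(as.integer(y3 != "a") ~ x, family = binomial)
pooled <- isTRUE(all.equal(coef(fit3), coef(fit3_num)))

c(same_as_numeric = same_as_numeric,
  sign_flipped    = sign_flipped,
  pooled          = pooled)
```

If all three checks come back `TRUE`, the level given to `ref` is the one modelled as 0 (failure), not 1. That would also explain why `fun()` in the question reports opposite signs: its second fit codes `t1` as 1, while the first fit, after `relevel`, codes `t1` as 0.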
