This is a simple question but I couldn't find a clear and compelling answer anywhere. If I have a regression model with one or more interaction terms, like:
mod1 <- lm(mpg ~ factor(cyl) * factor(am), data = mtcars)
coef(summary(mod1))
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.900000 1.750674 13.080673 0.0000000000006057324
## factor(cyl)6 -3.775000 2.315925 -1.630018 0.1151545663620229670
## factor(cyl)8 -7.850000 1.957314 -4.010599 0.0004547582690011110
## factor(am)1 5.175000 2.052848 2.520888 0.0181760532676256310
## factor(cyl)6:factor(am)1 -3.733333 3.094784 -1.206331 0.2385525615801434851
## factor(cyl)8:factor(am)1 -4.825000 3.094784 -1.559075 0.1310692573417492068
what is a sure fire way of identifying which coefficient estimates are for interaction terms? The obvious way is to grep()
for the colon symbol in the term names. But let's assume for a second that's not possible because of something like:
mtcars$cyl2 <- factor(mtcars$cyl, levels = c(4,6,8), labels = paste("Cyl:", unique(mtcars$cyl)))
mod2 <- lm(mpg ~ cyl2 * factor(am), data = mtcars)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.900000 1.750674 13.080673 0.0000000000006057324
## cyl2Cyl: 4 -3.775000 2.315925 -1.630018 0.1151545663620229670
## cyl2Cyl: 8 -7.850000 1.957314 -4.010599 0.0004547582690011110
## factor(am)1 5.175000 2.052848 2.520888 0.0181760532676256310
## cyl2Cyl: 4:factor(am)1 -3.733333 3.094784 -1.206331 0.2385525615801434851
## cyl2Cyl: 8:factor(am)1 -4.825000 3.094784 -1.559075 0.1310692573417492068
I thought perhaps the terms()
object would be useful but it isn't. I could also probably make some assumption about the ordering/numbering of terms to get the intended result:
coef(summary(mod2))[5:6,]
## Estimate Std. Error t value Pr(>|t|)
## cyl2Cyl: 4:factor(am)1 -3.733333 3.094784 -1.206331 0.2385526
## cyl2Cyl: 8:factor(am)1 -4.825000 3.094784 -1.559075 0.1310693
but I don't know how to do that in a general way.
What can be done?