I have a data set that I read in as follows:

test <- read.csv("data.csv", sep = ",", header = TRUE)

There are 10 predictor variables; the first column is the response variable:

x <- test[, -1]
y <- test[, 1]

To test a model that uses the first three predictor variables along with their interaction terms, here is what I did with lm:

test.model <- lm(y ~ x[,1]*x[,2]*x[,3], data = test)

But it turns out that the resulting model also includes the three-way interaction term x[, 1]:x[, 2]:x[, 3]. How can I limit the model to just the two-factor interactions, such as x[, 1]:x[, 2], x[, 2]:x[, 3] and x[, 1]:x[, 3]?

If I would like to consider all 10 predictor variables, instead of writing x[,1]*x[,2]*x[,3]*x[,4]*...x[,10], is there a convenient way to write this formula?

user785099

2 Answers

You can specify the highest order of interactions with ^.

y ~ (x[,1] + x[,2] + x[,3]) ^ 2

results in all two-variable interactions and main effects.
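
For concreteness, here is a minimal sketch, assuming the response column of test is named y and the first three predictors are named x1, x2 and x3 (the question does not give the actual column names):

# all main effects plus every two-way interaction, no three-way term
fit <- lm(y ~ (x1 + x2 + x3)^2, data = test)
# the formula expands to y ~ x1 + x2 + x3 + x1:x2 + x1:x3 + x2:x3

If every other column of test is a predictor, the same operator combines with the . shorthand, e.g. lm(y ~ (.)^2, data = test).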

Sven Hohenstein

Two points. It makes no sense to extract the predictor and response as separate objects if you are also going to supply a data argument. At worst it will start to fail at strange moments; at a minimum it will confuse your collaborators. The model is also going to be much easier to interpret if you have meaningful column names.
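
As a rough sketch of what that could look like (the column names below are made up, since the question does not show the real ones):

# give the columns meaningful names (placeholders here)
names(test) <- c("y", paste0("x", 1:10))

# refer to columns by name and let data = test supply them,
# instead of indexing into a separately extracted x matrix
test.model <- lm(y ~ x1 * x2 * x3, data = test)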

As Sven points out, you can use the "^" formula operator, which means something quite different from exponentiation. I'm pretty sure this is a duplicate SO question, so I will now do a bit of searching.
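
To illustrate the difference (again with made-up column names): inside a formula, "^" crosses terms up to the given degree, while I() is needed for an ordinary arithmetic power.

# crossing: expands to y ~ x1 + x2 + x1:x2
lm(y ~ (x1 + x2)^2, data = test)

# an actual squared predictor needs I() (or poly())
lm(y ~ x1 + I(x1^2), data = test)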

IRTFM