Suppose that I have X1,...,X14 potential predictors.

Now, for a given Y, I want to fit the following OLS regressions:

Y~X1+X2
Y~X1+X3
 ....
Y~X1+X14
....
Y~X14+X13

which is basically every pairwise combination of the predictors. Once all those regressions have been fitted, I want to use them with the predict function (if possible).

My question is: how do I fit all those regressions, one for each pairwise combination of the regressors?

Hercules Apergis

2 Answers

You can use combn to generate all the combinations and then apply to build all the formulas:

#all the combinations
all_comb <- combn(letters, 2)

#create the formulas from the combinations above and paste
text_form <- apply(all_comb, 2, function(x) paste('Y ~', paste0(x, collapse = '+')))

Output

> text_form
  [1] "Y ~ a+b" "Y ~ a+c" "Y ~ a+d" "Y ~ a+e" "Y ~ a+f" "Y ~ a+g".....

Then you can feed the above formulas into your regression using as.formula to convert the texts into formulas (most likely in another apply).
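As a minimal sketch of that last step, assuming the predictors are named X1 through X14 as in the question: the data frame `dat` and its simulated values are hypothetical and only there to make the example runnable, so substitute your own data.

```r
# Simulated data for illustration only; `dat` and its values are assumptions.
set.seed(1)
dat <- as.data.frame(matrix(rnorm(50 * 14), nrow = 50,
                            dimnames = list(NULL, paste0("X", 1:14))))
dat$Y <- rnorm(50)

# All pairs of predictor names, one pair per column
all_comb <- combn(paste0("X", 1:14), 2)

# One formula string per pair
text_form <- apply(all_comb, 2,
                   function(x) paste("Y ~", paste0(x, collapse = " + ")))

# Convert each string to a formula and fit the corresponding model
fits <- lapply(text_form, function(f) lm(as.formula(f), data = dat))
names(fits) <- text_form

length(fits)  # choose(14, 2) = 91 models
```

Naming the list by formula text lets you pull out a single fit later, e.g. `fits[["Y ~ X1 + X2"]]`.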

LyzandeR
  • Can you please explain more? I am not sure how it works. – Hercules Apergis Feb 12 '17 at 20:24
  • `combn` will create a matrix with all the 2-way combinations. Each column will contain a combination. Then we use `apply` which iterates over the columns in order to create the formulas. `paste` creates the text representing the formula. As you can see `text_form` has all the 2 way formulas represented as text. Using `text_form` to run the regressions would require an additional step as described in lmo's answer: `lapply(text_form, function(i) lm(i, data=df))`. – LyzandeR Feb 12 '17 at 20:34
You could also put them into formulas in one line like this:

mySpecs <- combn(letters[1:3], 2, FUN=function(x) reformulate(x, "Y"),
                 simplify=FALSE)

which returns a list that can be used in lapply to run regressions:

mySpecs
[[1]]
Y ~ a + b
<environment: 0x4474ca0>

[[2]]
Y ~ a + c
<environment: 0x4477e68>

[[3]]
Y ~ b + c
<environment: 0x447ae38>

You would then do the following to get a list of regression results.

myRegs <- lapply(mySpecs, function(i) lm(i, data=df))
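Since the question also asks about predict, here is one way it might look. This is only a sketch under assumptions: `train`, `newdat`, and the helper `make_data` are hypothetical names, and the data are simulated so that the example runs on its own.

```r
# Simulated data for illustration only; all object names here are assumptions.
set.seed(42)
pred_names <- paste0("X", 1:14)
make_data <- function(n) {
  d <- as.data.frame(matrix(rnorm(n * 14), nrow = n,
                            dimnames = list(NULL, pred_names)))
  d$Y <- rnorm(n)
  d
}
train  <- make_data(60)  # data used to fit the models
newdat <- make_data(5)   # new observations to predict for

# One formula per pair of predictors, then one fitted model per formula
specs <- combn(pred_names, 2, FUN = function(x) reformulate(x, "Y"),
               simplify = FALSE)
regs <- lapply(specs, function(f) lm(f, data = train))

# Call predict() on every model; sapply binds the results column-wise
preds <- sapply(regs, predict, newdata = newdat)
dim(preds)  # 5 rows (new observations) by 91 columns (models)
```

Each column of `preds` holds the predictions from one of the pairwise models, in the same order as `specs`.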
lmo
  • Hello, I get this error when I run the first part: `Error in matrix(r, nrow = len.r, ncol = count) : 'data' must be of a vector type, was 'language'` – Hercules Apergis Feb 12 '17 at 20:15
  • Sorry, I had a typo: I forgot to include the simplify=FALSE argument, which makes `combn` return a list rather than simplifying to a matrix. – lmo Feb 12 '17 at 21:29