I am trying to pass a vector of arbitrary length as an argument to a function that uses data.table syntax. The vector contains the names of the columns to be used as regressors.

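library(data.table)
data <- data.table(id = c(1,2,3,1,2,3), prd = c(1,1,1,2,2,2), y = rnorm(6), int = 1, x1 = rnorm(6), x2 = rnorm(6))
regressors <- c("x1", "x2")           # regressor names without the intercept column
regressors1 <- c("int", "x1", "x2")   # regressor names including the intercept column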
> data
   id prd        y int       x1       x2
1:  1   1 -0.04855   1 -1.27990  0.46696
2:  2   1 -0.39557   1  0.07829 -0.96699
3:  3   1 -0.23865   1 -0.39837 -1.12087
4:  1   2 -0.16012   1 -0.12919  0.22381
5:  2   2  0.21071   1  0.40768 -0.02047
6:  3   2 -0.55270   1 -0.87323  2.43326

The desired output is

output <- data[, as.list(lm.fit(cbind(int, eval(as.name(regressors[1])), eval(as.name(regressors[2]))), data.matrix(y))$coefficients), by = prd]
> output
   prd      int                 
1:   1 -0.42284 -0.3119 -0.05346
2:   2 -0.08158  0.7202  0.06486

I am using lm.fit() instead of lm() to speed up computations. I want to pass the vector of regressor names directly to the function, so that it can have arbitrary length. Hence, I want something like

output <- data[, as.list(lm.fit(cbind(lapply(regressors1, eval(as.name))), data.matrix(y))$coefficients), by = prd]

but it does not work. Following the question "Applying function row-wise in a data.table; passing column names as a vector", I also tried, unsuccessfully,

output <- data[, as.list(lm.fit(cbind(unlist(mget(regressors1))), data.matrix(y))$coefficients), by = prd]

Moysey Abramowitz
    Could use `.SDcols`. You don't need all the `eval/as.name` calls. Try `data[, as.list(lm.fit(as.matrix(.SD), y)$coefficients), .SDcols = regressors1, by = prd]` – David Arenburg Jun 02 '19 at 08:43
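
A minimal sketch of the `.SDcols` approach suggested in the comment above, applied to the example data from the question. The `set.seed()` call is an assumption added only to make the sketch reproducible; it is not part of the original post.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, not in the original question
data <- data.table(id = c(1,2,3,1,2,3), prd = c(1,1,1,2,2,2),
                   y = rnorm(6), int = 1, x1 = rnorm(6), x2 = rnorm(6))
regressors1 <- c("int", "x1", "x2")

# .SDcols restricts .SD to the columns named in regressors1, so the
# design matrix is built from a character vector of any length.
# y is not in .SDcols but is still visible as a column inside j.
output <- data[, as.list(lm.fit(as.matrix(.SD), y)$coefficients),
               by = prd, .SDcols = regressors1]
print(output)
```

Because `as.matrix(.SD)` keeps the column names, the coefficient columns in the result are labelled `int`, `x1` and `x2`. For comparison, `cbind(unlist(mget(regressors1)))` flattens the selected columns into one long vector, so it yields a single-column matrix whose row count no longer matches `y`.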
