I have a training data table in R whose columns vary from run to run. For example, it currently has the following column names:

library(mgcv)
dt.train <- c("DE", "DEWind", "DESolar", "DEConsumption", "DETemperature", 
              "DENuclear", "DELignite")

Now I want to fit a generalized additive model (GAM) with integrated smoothness estimation that predicts the DE price. At the moment I fit the model as follows:

fitModel <- mgcv::gam(DE ~ s(DEWind)+s(DESolar)+s(DEConsumption)+s(DETemperature)+
                           s(DENuclear)+s(DELignite), 
                      data = dt.train)

The column names are currently hard-coded, and I don't want to edit them every time the columns change. I would like the program to detect which columns are present and fit the model with those. In other words, I would like something like this (which works for stats::lm() or stats::glm()):

fitModel <- mgcv::gam(DE ~ .-1, data = dt.train)

Unfortunately, this doesn't work with gam().

aynber
MikiK

1 Answer

I don't recommend you do this for statistical reasons, but…

nms <- c("DE", "DEWind", "DESolar", "DEConsumption", "DETemperature", 
              "DENuclear", "DELignite")
## typically you'd get those names as
## nms <- names(dt.train)

## identify the response
resp <- 'DE'
## filter out response from `nms`
nms <- nms[nms != resp]

Create the right-hand side of the formula by wrapping each name in s( and ), then concatenating the resulting strings, separated by +:

rhs <- paste('s(', nms, ')', sep = '', collapse = ' + ')

which gives us

> rhs
[1] "s(DEWind) + s(DESolar) + s(DEConsumption) + s(DETemperature) + s(DENuclear) + s(DELignite)"

Then you can add on the response and ~:

fml <- paste(resp, '~', rhs, collapse = ' ')

which gives

> fml
[1] "DE ~ s(DEWind) + s(DESolar) + s(DEConsumption) + s(DETemperature) + s(DENuclear) + s(DELignite)"

Finally coerce to a formula object:

fml <- as.formula(fml)

which gives

> fml
DE ~ s(DEWind) + s(DESolar) + s(DEConsumption) + s(DETemperature) + 
    s(DENuclear) + s(DELignite)
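
If you need this repeatedly, the steps above could be wrapped in a small function. The following is only a sketch: make_gam_formula is a hypothetical helper name, the data are simulated stand-ins for dt.train with only three of the predictors, and it assumes every column other than the response is numeric and should get a default s() smooth. It uses stats::reformulate() in place of the paste() and as.formula() steps.

```r
library(mgcv)

# Hypothetical helper: build `resp ~ s(x1) + s(x2) + ...` from the
# column names of `data`, treating every other column as a smooth term.
make_gam_formula <- function(data, resp) {
  stopifnot(resp %in% names(data))
  # all columns except the response, in their original order
  preds <- setdiff(names(data), resp)
  # reformulate() assembles a formula from term labels and a response
  reformulate(paste0("s(", preds, ")"), response = resp)
}

# Simulated stand-in for dt.train (invented values, illustration only)
set.seed(1)
n <- 200
dt.train <- data.frame(
  DEWind        = runif(n),
  DESolar       = runif(n),
  DEConsumption = runif(n)
)
dt.train$DE <- sin(2 * pi * dt.train$DEWind) + dt.train$DESolar^2 +
  rnorm(n, sd = 0.2)

fml <- make_gam_formula(dt.train, "DE")
fitModel <- gam(fml, data = dt.train)
```

Because the formula is rebuilt from names(dt.train) on each call, adding or dropping a column changes the model without touching the code. Factor or character columns would need to be excluded or added as parametric terms instead of being wrapped in s().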
Gavin Simpson