I am fitting a linear regression and would like to fix the coefficients of some inputs. I found a way to do this with offset. Here is an example:

set.seed(145)
df <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))

summary(lm(formula = a ~ . + offset(0.1*c) - c + offset(0.05*d) - d, data = df))

The problem is that I have many more variables, and I would like to generate my lm formula automatically.

Say I want to pass the names of the inputs (columns of the data passed to lm) and the values of their coefficients, for example like this:

inputs_fix <- c("c", "d")
inputs_fix_coef <- c(0.1, 0.05)

Then I need a function that writes a formula like the one above, but I don't know how to build the expression offset(0.1*c) - c + offset(0.05*d) - d from the inputs_fix and inputs_fix_coef objects.

Is it possible? Is there another (more elegant) way to fix coefficients? I appreciate any help.

UPDATE: creating the formula with paste0 and as.formula, following @Jan van der Laan's suggestion

# build " + offset(0.1*c) - c  + offset(0.05*d) - d" from the two vectors
my.formula <- paste0(" + offset(", inputs_fix_coef, "*", inputs_fix, ") - ", inputs_fix, collapse = " ")
lm.fit <- lm(formula = as.formula(paste0("a ~ .", my.formula)), data = df)

It is less readable, but it keeps all the inputs in the lm object (lm.fit$model), which are lost in @Jan van der Laan's answer, and it does not require duplicating the data.frame.

Andriy T.

1 Answer

One way of handling this would be to calculate a new column with your total offset and remove the columns used in your offset from the data set:

# create a copy of the data without the columns used in the offset
dat <- df[-match(inputs_fix, names(df))]

# calculate offset
dat$offset <- 0
for (i in seq_along(inputs_fix)) 
  dat$offset <- dat$offset + df[[inputs_fix[i]]]*inputs_fix_coef[i]

# run regression
summary(lm(formula = a ~ . + offset(offset) - offset, data = dat))
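
As a quick sanity check, the sketch below compares the precomputed-offset fit with the explicit formula from the question. It reuses the question's example data and the matrix product suggested in the comments; the names fit_offset and fit_formula are made up for this illustration, and drop() is added on the assumption that a plain numeric column is preferable to a one-column matrix.

```r
set.seed(145)
df <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))
inputs_fix <- c("c", "d")
inputs_fix_coef <- c(0.1, 0.05)

# Precomputed-offset approach: keep only the free inputs, then add one
# column holding the total offset. drop() converts the n x 1 matrix
# returned by %*% into a plain numeric vector.
dat <- df[setdiff(names(df), inputs_fix)]
dat$offset <- drop(as.matrix(df[inputs_fix]) %*% inputs_fix_coef)
fit_offset <- lm(a ~ . + offset(offset) - offset, data = dat)

# Explicit formula from the question, fitted on the original data
fit_formula <- lm(a ~ . + offset(0.1 * c) - c + offset(0.05 * d) - d, data = df)

# Compare the estimated coefficients (intercept and b) of the two fits
all.equal(coef(fit_offset), coef(fit_formula))
```

Both fits estimate only the intercept and the coefficient of b; the fixed inputs enter solely through the offset.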

It is also always possible to generate your formula as a character string (using paste etc.) and then convert it to a formula object using as.formula, but I suspect the solution above is cleaner.
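
For completeness, here is a minimal sketch of the string-based route. The helper build_offset_formula and its argument names are hypothetical (they do not come from the posts above); it only wraps paste0 and as.formula, and assumes the fixed inputs are plain column names.

```r
# Hypothetical helper: builds
#   response ~ . + offset(coef1 * x1) - x1 + offset(coef2 * x2) - x2 + ...
build_offset_formula <- function(response, fixed, coefs) {
  stopifnot(length(fixed) == length(coefs))
  # one "offset(coef * x) - x" piece per fixed input
  pieces <- paste0("offset(", coefs, " * ", fixed, ") - ", fixed)
  as.formula(paste(response, "~ . +", paste(pieces, collapse = " + ")))
}

set.seed(145)
df <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))

f <- build_offset_formula("a", c("c", "d"), c(0.1, 0.05))
fit <- lm(f, data = df)
coef(fit)  # estimates for the intercept and b only; c and d are fixed
```

Unlike the precomputed-offset column, this keeps the fixed inputs in the model frame (fit$model), at the cost of a less readable formula.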

Jan van der Laan
  • Thank you, it works! Following your suggestion, I found a way to create the formula and added it to the question as an update. – Andriy T. Jun 24 '15 at 10:27
  • You can also use matrix multiplication instead of a loop: `dat$offset <- as.matrix(df[inputs_fix]) %*% inputs_fix_coef` – Hong Ooi Jun 24 '15 at 10:45