I am fitting a nonlinear regression model that needs starting values for its parameters, but the number of variables I want to include may be too large to type every value by hand. Is there a way to generate the starting values programmatically?

set.seed(12345)
y = rnorm(100, 1000,150)
x1 = rnorm(100,10000,251)
x2 = rnorm(100, 3000,654)
x3 = rnorm(100, 25000,100)
x4 = rnorm(100, 200000,589)
x5 = rnorm(100, 31657,296)

adstock <- function(x,rate=0){
  return(as.numeric(stats::filter(x=log(x+1),filter=rate,method="recursive")))
}

library(minpack.lm)

nlsLM(y~b0  
      + b1 * adstock(x1, r1)
      + b2 * adstock(x2, r2)
      + b3 * adstock(x3, r3)
      + b4 * adstock(x4, r4)
      + b5 * adstock(x5, r5)
      
      , algorithm = "LM"

# this is where I need to paste the results from the loop 
      , start = c(b0=1,b1=1,b2=1,b3=1,b4=1,b5=1
                  ,r1=0.1,r2=0.1,r3=0.1,r4=0.1,r5=0.1
                  )
# end
      
      , control = list(maxiter = 200)
)

My idea was to use a loop to pass the values to the model, but I can't make it work (the following code is meant to cover the b_i coefficients only):

test_start <- NULL

for(i in 1:(5+1)) {
  test_start[i] = paste0("b",i-1,"=",1)
}

cat(test_start)

This is the result, which is not exactly what the model expects:

b0=1 b1=1 b2=1 b3=1 b4=1 b5=1

How can I pass the results of the loop to the model? Also, how can I add the r_i starting values to the b_i starting values in the loop? Any help would be greatly appreciated.

PS: at the moment I only want to assign the same value to each of b0, b1, ..., b5 (in this case, 1) and the same value to each of r1, r2, ..., r5 (in this case, 0.1).

GNicoletti

1 Answer

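Define the data as a data frame DF and the formula as fo, then use grep to pick the b and r parameters out of the formula's variables. The line defining v creates a character vector of their names, and the line defining st creates a named numeric vector with value 1 for the b's and 0.1 for the r's (grepl returns TRUE/FALSE, which the arithmetic coerces to 1/0).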

DF <- data.frame(y, x1, x2, x3, x4, x5)

n <- ncol(DF) - 1
rhs <- c("b0", sprintf("b%d * adstock(x%d, r%d)", 1:n, 1:n, 1:n))
fo <- reformulate(rhs, "y")

v <- grep("[br]", all.vars(fo), value = TRUE)
st <- setNames(grepl("b", v) + 0.1 * grepl("r", v), v)
st

nlsLM(fo, DF, start = st, algorithm = "LM", control = list(maxiter = 200))
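The loop in the question produces character strings such as "b0=1", whereas start expects a named numeric vector (or list). As a minimal sketch of building that vector directly, without going through the formula, the following uses setNames and rep. The names n and st_direct are chosen here for illustration only, and n = 5 is assumed to match x1, ..., x5 above.

```r
# Number of predictors (assumption: 5, matching x1..x5 in the question).
n <- 5

# Named numeric vector: b0..b5 start at 1, r1..r5 start at 0.1.
# st_direct is a hypothetical name used only in this sketch.
st_direct <- c(
  setNames(rep(1, n + 1), paste0("b", 0:n)),
  setNames(rep(0.1, n), paste0("r", seq_len(n)))
)

st_direct
```

The elements come out in a different order than in st (all b's first, then all r's), which should not matter because the starting values are looked up by name. It could be passed as start = st_direct in the nlsLM call above.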

Regarding the comment, try defining rhs like this. In the first line, take whatever subset of labs you want, e.g. labs <- labels(...)[1:9], or change the formula in the first line, e.g. labs <- labels(terms(y ~ .*(1 + x1), data = DF)).

labs <- labels(terms(y ~ .^2, data = DF))
labs <- sub(":", "*", labs)
n <- length(labs)
rhs <- c("b0", sprintf("b%d * adstock(%s, r%d)", 1:n, labs, 1:n))
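The snippet above stops at rhs. A possible way to carry it through to the formula and the starting values, keeping only selected interactions, is sketched below. This is an illustration under assumptions: the three-predictor DF, the keep vector, and the stricter pattern "^[br][0-9]+$" are all introduced here and are not part of the original answer. Building the formula does not require adstock to be defined; it is only needed when the model is fitted.

```r
# Small stand-in data set (assumption: three predictors, for brevity).
set.seed(12345)
DF <- data.frame(
  y  = rnorm(100, 1000, 150),
  x1 = rnorm(100, 10000, 251),
  x2 = rnorm(100, 3000, 654),
  x3 = rnorm(100, 25000, 100)
)

# All main effects and pairwise interactions: x1, x2, x3, x1:x2, x1:x3, x2:x3.
labs <- labels(terms(y ~ .^2, data = DF))

# Hypothetical selection: all main effects plus the x1:x2 interaction only.
keep <- c("x1", "x2", "x3", "x1:x2")
labs <- sub(":", "*", intersect(labs, keep), fixed = TRUE)

# One b/r pair per retained term, plus the intercept b0.
n   <- length(labs)
rhs <- c("b0", sprintf("b%d * adstock(%s, r%d)", 1:n, labs, 1:n))
fo  <- reformulate(rhs, "y")

# Parameter names are b<digits> or r<digits>; b's start at 1, r's at 0.1.
v  <- grep("^[br][0-9]+$", all.vars(fo), value = TRUE)
st <- setNames(ifelse(startsWith(v, "b"), 1, 0.1), v)

fo
st
```

With adstock defined and minpack.lm loaded, fo and st would then be passed to nlsLM in the same way as before, e.g. nlsLM(fo, DF, start = st, control = list(maxiter = 200)).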
G. Grothendieck
  • Excellent point! But now I have an extra question: since I can no longer type the formula by hand (it is now built into the object "fo"), what is the best way to add interactions, e.g. ... + b6 * adstock(x1*x2, r6) + b7 * adstock(x1*x3, r7) + ...? Should I create an x6 variable in DF containing x1*x2, or do you have other suggestions? I may be interested in some specific interactions rather than all possible ones. – GNicoletti Dec 01 '20 at 22:49
  • See added part. – G. Grothendieck Dec 02 '20 at 00:08
  • Yep, changing the "labs" object will definitely work for me! Thank you so much for your help! – GNicoletti Dec 02 '20 at 10:33