
One method I have seen in the literature is using optim() to choose initial values for nonlinear models fit with nls() or nlme(); however, I am puzzled by the actual implementation.

Take an example using COVID data from Alachua, FL:

dat=data.frame(x=seq(1,10,1), y=c(27.9,23.1,24.6,33.0,48.0,136.4,243.4,396.7,519.9,602.8))

Here x gives the time points and y gives the number of people infected per 10,000 people.

Now, if I wanted to fit a four-parameter logistic model in nls, I could use

n1 <- nls(y ~ SSfpl(x, A, B, M, S), data = dat)

But now imagine that parameter estimation is highly sensitive to the initial values, so I want to optimize my choice of starting values. How would this be achieved?

The approach I have thought to try is as follows:

fun_to_optim <- function(data, guess){
    x = data$x
    y = data$y

    A = guess[1]
    B = guess[2]
    M = guess[3]
    S = guess[4]

    y = A + (B-A)/(1+exp((M-x)/S))
    return(-sum(y))
}

optim(fn=fun_to_optim, data=dat, 
  par=c(10,10,10,10),
  method="Nelder-Mead")

The result from optim() is wrong but I cannot see my error. Thank you for any assistance.


1 Answer


The main issue is that you're not computing/returning the sum of squares from your objective function. However: I think you really have it backwards. Using nls() with SSfpl is about the best you're going to do in terms of optimization: it has sensible heuristics for picking starting values (SS stands for "self-starting"), and it provides a gradient function for the optimizer. It's not impossible that, with a considerable amount of work, you could find better heuristics for picking starting values for a particular system, but in general switching from nls to optim + Nelder-Mead will leave you worse off than when you started (illustration below).
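As an aside, you can ask the self-starting machinery for its starting values directly: getInitial() returns the values that SSfpl would compute for these data (a small sketch using the dat from the question; output not shown):

getInitial(y ~ SSfpl(x, A, B, M, S), data = dat)

With the missing sum of squares added, the corrected objective function is: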

fun_to_optim <- function(data, guess){
    x       = data$x
    y       = data$y 
    
    A   = guess[1]
    B   = guess[2] 
    M   = guess[3] 
    S   = guess[4]
    
    ## predicted values from the four-parameter logistic
    y_pred = A + (B-A)/(1+exp((M-x)/S)) 
    ## residual sum of squares (what least squares minimizes)
    return(sum((y-y_pred)^2))
}
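A quick check that this objective is doing the right thing (a sketch; output not shown): evaluate it at the nls() estimates and compare with the residual sum of squares that nls() reports. The two numbers should agree.

fun_to_optim(dat, coef(n1))  # objective evaluated at the nls() estimates
deviance(n1)                 # residual sum of squares from the nls fit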

Fit with optim() using (1) your suggested starting values; (2) better starting values somewhere near the correct ones (you could get most of these by reading off the geometry of the function: A is the left asymptote, B is the right asymptote, M is the midpoint, S is the scale; a rough data-driven version of this is sketched after the three fits below); (3) the same as #2 but using BFGS rather than Nelder-Mead.

opt1 <- optim(fn=fun_to_optim, data=dat, 
  par=c(A=10,B=10,M=10,S=10),
  method="Nelder-Mead")
opt2 <- optim(fn=fun_to_optim, data=dat, 
  par=c(A=10,B=500,M=10,S=1),
  method = "Nelder-Mead")
opt3 <- optim(fn=fun_to_optim, data=dat, 
  par=c(A=10,B=500,M=10,S=1),
  method = "BFGS")

Results:

xvec <- seq(1,10,length=101)
plot(y~x, data=dat)
lines(xvec, predict(n1, newdata=data.frame(x=xvec)))
p1 <- with(as.list(opt1$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p1, col=2)
p2 <- with(as.list(opt2$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p2, col=4)
p3 <- with(as.list(opt3$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p3, col=6)
legend("topleft", col=c(1,2,4,6), lty=1,
       legend=c("nls","NM (bad start)", "NM", "BFGS"))

[plot: fitted curves from nls and the three optim runs overlaid on the data]

  • nls and good starting values + BFGS overlap, and provide a good fit
  • optim/Nelder-Mead from bad starting values is absolutely terrible — converges on a constant line
  • optim/N-M from good starting values gets a reasonable fit, but clearly worse; I haven't analyzed why it gets stuck there (a numeric comparison of the fitted sums of squares is sketched below).
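Comparing the minimized sums of squares tells the same story as the plot (a sketch; the actual numbers are left for you to run):

c(nls = deviance(n1), NM_bad = opt1$value, NM = opt2$value, BFGS = opt3$value)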