0

This is the function that I'd like to code in R,

enter image description here

i = 1,2,3,....j-1

a,b,c,f,g are to be determined from nls (with starting value arbitrarily set to 7,30,15,1,2)

S and Y are in the dataset

The function can be presented in a more computational friendly recursive equations,

enter image description here

Here is my attempt at the code but I could not get it to converge,

S=c(235,90,1775,960,965,1110,370,485,667,140,588,10,0,1340,600,0,930,1250,930,120,895,825,0,935,695,270,0,610,0,0,445,0,0,370,470,819,717,0,0,60,0,135,690,0,825,730,1250,370,1010,261,0,865,570,1425,150,1515,1143,0,675,1465,375,0,690,290,0,430,735,510,270,450,1044,0,928,60,95,105,60,950,0,1640,3960,1510,500,1135,0,0,0,181,568,60,1575,247,0,1270,870,290,510,0,540,455,120,580,420,90,525,1116,499,0,60,150,660,1080,1715,90,1090,840,975,280,850,633,30,1530,1765,880,150,225,77,1380,810,835,0,540,1017,1108,0,300,600,90,370,910,0,60,60,0,0,0,0,50,0,735,900)

Y=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,7.7,NA,NA,7.2,NA,NA,NA,NA,NA,NA,7.4,NA,NA,NA,NA,NA,NA,10.7,NA,NA,NA,NA,8.1,8.5,NA,NA,NA,NA,NA,9.9,NA,7.4,NA,NA,NA,9.5,NA,NA,9,NA,NA,NA,8.8,NA,NA,8.5,NA,NA,NA,6.9,NA,NA,7.9,NA,NA,NA,7.3,NA,7.9,8.3,NA,NA,NA,11.5,NA,NA,12.3,NA,NA,NA,6.1,NA,NA,9,NA,NA,NA,10.3,NA,NA,9.7,NA,NA,8.6,NA,9.1,NA,NA,11,NA,NA,12.4,11.1,10.1,NA,NA,NA,NA,11.7,NA,NA,9,NA,NA,NA,10.2,NA,NA,11.2,NA,NA,NA,11.8,NA,9.2,10,9.8,NA,9.5,11.3,10.3,9.5,10.2,10.6,NA,10.8,10.7,11.1,NA,NA,NA,NA,NA,NA,NA,NA,12.6,NA)

mydata = data.frame(Y,S)

f <- function(a,b,f,c,g,m) {

    model <- matrix(NA,nrow(m)+1,3)

    model[1,1]=0
    model[1,2]=0
    model[1,3]=a

    for (i in 2:nrow(model)){
        model[i,1]=exp(-1/c)*model[i-1,1] + m$S[i-1] 
        model[i,2]=exp(-1/g)*model[i-1,2] + m$S[i-1]
        model[i,3]=a+b*model[i,1]-f*model[i,2]
    }
    model <- as.data.frame(model)
    colnames(model) = c('l','m','Y')
    model$Y[which(m$Y>0)]
}

Y=mydata$Y
nls(Y ~ f(a,b,f,c,g,mydata), start=list(a=7,b=5.3651,f=5.3656,c=16.50329,g=16.5006),control=list(maxiter=1000,minFactor=1e-12))

Errors that I've been getting depends on the starting values are:

Error in nls(Y ~ f(a, b, f, c, g, mydata), start = list(a = 7, :
number of iterations exceeded maximum of 1000

Error in nls(Y ~ f(a, b, f, c, g, mydata), start = list(a = 7, :
singular gradient

I'm stuck and not sure what to do, any help would be greatly appreciated.

mct
  • 17
  • 3
  • a) you have to construct mydata as data.frame to use $: `mydata <- data.frame(Y,S)`. b) what is `p0`? c) can you run `f(7,5,16,16,mydata)`? Do you get correct results? If not, nls is not going to work... – Ott Toomet Dec 17 '15 at 17:28
  • sorry, p0 = a, correction updated. f(7,5,16,16,mydata) works, it generates the predicted Y – mct Dec 17 '15 at 17:36

1 Answers1

1

Try this:

ff <- function(a,b,f,c,g) {
   Y <- numeric(length(S))
   for(i in seq(from=2, to=length(S))) {
      j <- seq(length=i-1)
      Y[i] <- a + sum((b*exp(-(i-j)/c) - f*exp(-(i-j)/g))*S[j])
   }
   Y
}

S <- c(235,90,1775,960,965,1110,370,485,667,140,588,10,0,1340,600,0,930,1250,930,120,895,825,0,935,695,270,0,610,0,0,445,0,0,370,470,819,717,0,0,60,0,135,690,0,825,730,1250,370,1010,261,0,865,570,1425,150,1515,1143,0,675,1465,375,0,690,290,0,430,735,510,270,450,1044,0,928,60,95,105,60,950,0,1640,3960,1510,500,1135,0,0,0,181,568,60,1575,247,0,1270,870,290,510,0,540,455,120,580,420,90,525,1116,499,0,60,150,660,1080,1715,90,1090,840,975,280,850,633,30,1530,1765,880,150,225,77,1380,810,835,0,540,1017,1108,0,300,600,90,370,910,0,60,60,0,0,0,0,50,0,735,900)
Y <-  c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,7.7,NA,NA,7.2,NA,NA,NA,NA,NA,NA,7.4,NA,NA,NA,NA,NA,NA,10.7,NA,NA,NA,NA,8.1,8.5,NA,NA,NA,NA,NA,9.9,NA,7.4,NA,NA,NA,9.5,NA,NA,9,NA,NA,NA,8.8,NA,NA,8.5,NA,NA,NA,6.9,NA,NA,7.9,NA,NA,NA,7.3,NA,7.9,8.3,NA,NA,NA,11.5,NA,NA,12.3,NA,NA,NA,6.1,NA,NA,9,NA,NA,NA,10.3,NA,NA,9.7,NA,NA,8.6,NA,9.1,NA,NA,11,NA,NA,12.4,11.1,10.1,NA,NA,NA,NA,11.7,NA,NA,9,NA,NA,NA,10.2,NA,NA,11.2,NA,NA,NA,11.8,NA,9.2,10,9.8,NA,9.5,11.3,10.3,9.5,10.2,10.6,NA,10.8,10.7,11.1,NA,NA,NA,NA,NA,NA,NA,NA,12.6,NA)
nls(Y ~ f(a,b,f,c,g,mydata), start=list(a=7,b=5.3651,f=5.3656,c=16.50329,g=16.5006))

But I am unable to get nls to run here. You may also try a general-purpose optimizer. Construct the sum of squares function (-sum of squares as we maximize it):

SS <- function(par) {
   a <- par[1]
   b <- par[2]
   f <- par[3]
   c <- par[4]
   g <- par[5]
  -sum((Y - ff(a,b,f,c,g))^2, na.rm=TRUE)
}

and maximize:

library(maxLik)
summary(a <- maxBFGS(SS, start=start))

It works, but as you see the gradients are still pretty large. I get gradients small if I re-run a NR optimizer on the output values of BFGS:

summary(b <- maxNR(SS, start=coef(a)))

which gives the results

Newton-Raphson maximisation 
Number of iterations: 1 
Return code: 2 
successive function values within tolerance limit 
Function value: -47.36338 
Estimates:
   estimate      gradient
a 10.584488  0.0016371615
b  6.954444 -0.0043306656
f  6.955095  0.0043327901
c 28.622035 -0.0005735572
g 28.619185  0.0003871179

I don't know if this makes sense. The issues with nls and the other optimizers hint that you have numerical instabilities, either related to large numerical values, or the difference of exponents in the model formula.

Check what is going on there :-)

Ott Toomet
  • 1,894
  • 15
  • 25
  • I got an error typing the command "summary (a <- maxBFGS(SS, start=start)", shouldn't I put specify SS(?)? I also changed the starting value to f(10,10,5,10,20,mydata) and it converged. The numbers looks more correct than the maxNR results, b&f and c&g number should not be that similar, they should be 2-3x difference. Also, even if it converges now with my original code, I won't be able to guess the proper starting value if presented a different set of data. how can I determine the right starting value? maybe your maxNR method is better for this reason. – mct Dec 18 '15 at 20:20
  • What is the error message you get? There are no general good way to get good initial values---it would amount to solving the problem in the first place. You may try a robust global optimiser, like SANN, or Nelder-Mead (robust but not global). Run it a few hundred times and take those coefficients for start values for N-R or BFGS. Analytic gradients may also be of great help if you have numerical issues. – Ott Toomet Dec 18 '15 at 20:50