0

I'm running a nonlinear multiple regression with the nls() function with one dependent variable (Gross Primary Production (GPP)) and three independent variables (solar irradiance (RAD), Green Fractional Cover (GFC) and Volumetric Water Content (VWC)). I'm trying to follow the model of Magnani et al. (2022) which is: GPP = (Fα0RAD/ F+ α0RAD) * (A0+A1GFC+A2VWC)+ε, where F, α0, A0, A1, A2 are the parameters to estimate.

This is the code I used:

nls.3<- nls(GPP~(F*α0*RAD/(F+α0*RAD))*(A0+(A1*GFC)+(A2*VWC)), data = SCALED,start=list(F=-2.16, α0=-0.031, A0=0.021, A1=7.31, A2=0.0024),control=nls.control( minFactor=2^-148, warnOnly=TRUE,maxiter=10000))

In this attempt I took as starting values the estimated parameters of the cited model (my data are from the same site, but the year is different).

This is the output I got:

Formula: GPP ~ (F * α0 * RAD/(F + α0 * RAD)) * 
    (A0 + (A1 * GFC) + (A2 * VWC))

Parameters:
    Estimate Std. Error t value Pr(>|t|)
F  -4.063e+00  2.488e+08       0        1
α0 -5.831e-02  3.571e+06       0        1
A0  2.508e-03  1.536e+05       0        1
A1  8.720e-01  5.341e+07       0        1
A2  2.864e-04  1.754e+04       0        1

Residual standard error: 1.003 on 278 degrees of freedom

Number of iterations till stop: 10000 
Achieved convergence tolerance: 0.5849
Reason stopped: il numero di iterazioni ha superato il massimo di 10000

------
Residual sum of squares: 279

------
t-based confidence interval:

------
Correlation matrix:
  a b L d e
a 1 1 1 1 1
b 1 1 1 1 1
L 1 1 1 1 1
d 1 1 1 1 1
e 1 1 1 1 1

I never saw a similar output with all the t statistics = 0 and all the p values = 1.

Can someone tell me what I'm doing wrong?

[or there is another way to run this model?]

Below is a sample of the head of my data (all the variables are standardized):

              RAD         GFC       VWC                GPP
1          -0.2491831 -1.0107985  1.4436443        0.3294411
2          -0.2171896 -0.8891009 -1.2268249        0.8456750
3          -0.1498026  0.9968661 -0.8714393       -0.4678534
4           0.2738084 -1.0062102 -1.6228261        0.3982723
5          -0.5789165 -0.6060990 -0.9932858        0.6449174
6           0.1203928 -0.6509521 -0.4957459        0.1057398

Fra AdV
  • 1
  • 1
  • 3
  • Your model had not converged. The Achieved convergence tolerance is very similar to the outcome GPP. I would use 10x more iterations and see what happens – danlooo Apr 06 '22 at 09:15
  • @danlooo Thank you very much for your comment! I tried to increase the iterations up to 10 millions, but unfortunately the output hasn't changed... – Fra AdV Apr 06 '22 at 11:18
  • What happens if you use another more simpler formula with less terms? Maybe you also have to less data to fit something meaningful – danlooo Apr 06 '22 at 11:26
  • @danlooo The first model I run with these data was simpler `nls.tot<-nls(GPP~a*b*RAD/(a+b*RAD), data = GPP.TOT,start=list(a=-2.16, b=-0.031))` and the output was better (t statistics and p values reasonable), but now my aim is to fit this complex model on these data, because I need to see if it can be generalized. (My dataset n is 283). – Fra AdV Apr 06 '22 at 13:22
  • The formula has many roots making it usually unsuitable to solve with convex solvers using least squared e.g. nls. Was nls successfully applied to this formula in the past? – danlooo Apr 06 '22 at 14:19
  • @danlooo well, no, because this formula was applied with MATLAB functions in the past. It is the first time with R and nls... I've to find a way to apply this formula on R, but I wasn't able to find any other function than nls... – Fra AdV Apr 06 '22 at 14:41
  • Try to get the fitting method used by MATLAB and read https://www.r-bloggers.com/2012/07/a-better-nls/. – danlooo Apr 06 '22 at 14:47
  • @danlooo thank you very much! I'm going to ask the Matlab code and read the link! – Fra AdV Apr 06 '22 at 14:57

1 Answers1

0

This is not a software problem. It is a fundamental problem with the model as it is not identifiable. If (F, a0, A1, A2, A3) is a solution then replacing it with (F/s, a0/s, sA1, sA2, s*A3) gives the same fitted values for any non-zero scalar s. The answer obtained with the other software is somewhat misleading as there are an infinite number of solutions.

For example, make this transformation: a0=b0*F and Ai=Bi/(b0*F) for i=0,1,2. Then the right hand side becomes the following in 4, rather than 5, parameters.

(RAD/(1 + b0 * RAD)) * (B0 + (B1 * GFC) + (B2 * VWC))

Using the plinear algorithm of nls we only need to provide starting values for the parameters that enter non-linearly so only b0 needs a starting value. To use it provide a matrix on the right hand side with one column per linear parameter as shown. You may need a different starting value.

nls(GPP ~ RAD/(1 + b0 * RAD) * cbind(B0 = 1, B1 = GFC, B2 = VWC),
  data = SCALED, start = list(b0 = 1), algorithm = "plinear")
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thank you very much! It worked! So now I've 4 parameters estimated, but since I'm not prepared on this topic (as you can easily immagine) can I ask you how I can get the original parameters (a0, Ai) with the transformation? The F is still missing... – Fra AdV Apr 07 '22 at 12:36
  • The whole point of this is that the parameters are not identifiable. Any (F, a0, Ai) which maps to a given (b0, Bi) gives the same predictions as all others and there are an infinite number of such combinations. If you find this hard to understand consider a simpler problem of finding the (x, y) pair which minimizes x+y subject to x+y >= 0. Any (x, y) pair for which x = -y will work so what does it mean to find x and y? – G. Grothendieck Apr 07 '22 at 12:47
  • Yes, I can understand this simpler version, thank you! Now I think I need to focus on the comprehension of the model (because after your answers I read again the two paper and I think I missed some fundamental steps) they passed me to understand how they get to the result (because if it is unidentifiable with R the same should be with matlab)...Thanks for your patience! – Fra AdV Apr 07 '22 at 13:56