I have two sets of age and length data for the same fish species, both provided in the following link.

I would like to fit a growth model in R that allows for a change in growth at a specific moment of the lifespan.

I tried using the nls function with starting values adapted to my data. The model is an adaptation of the von Bertalanffy growth model that estimates five parameters (Linf, K0, t0, K1, and t1).

The code I used, for both datasets, was the following:

fit <- as.formula(TL ~ Linf * (1 - exp(-K0 * (Age - t0))) * (Age < t1) +
                    Linf * (1 - exp(-K0 * (t1 - t0) - K1 * (Age - t1))) * (Age > t1))

model <- nls(fit, data = dataset,
             start = list(Linf = 17, K0 = 0.3, t0 = -2, K1 = 0.1, t1 = 3),
             control = nls.control(maxiter = 500, tol = 1e-03, minFactor = 1/1024,
                                   printEval = FALSE, warnOnly = FALSE))
summary(model)

For the first dataset the values returned were the following:

Formula: TL ~ Linf * (1 - exp(-K0 * (Age - t0))) * (Age < t1) + Linf * 
    (1 - exp(-K0 * (t1 - t0) - K1 * (Age - t1))) * (Age > t1)

Parameters:
       Estimate Std. Error t value Pr(>|t|)    
Linf  4.089e+02  1.565e+04   0.026   0.9792    
K0    5.477e-03  2.141e-01   0.026   0.9796    
t0   -2.934e+00  1.500e+00  -1.956   0.0511 .  
K1    7.596e-04  3.004e-02   0.025   0.9798    
t1    2.246e+00  2.143e-01  10.477   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.881 on 457 degrees of freedom

Number of iterations to convergence: 294 
Achieved convergence tolerance: 0.000979

For the second dataset, the values returned were:

Formula: TL ~ Linf * (1 - exp(-K0 * (Age - t0))) * (Age < t1) + Linf * 
    (1 - exp(-K0 * (t1 - t0) - K1 * (Age - t1))) * (Age > t1)

Parameters:
     Estimate Std. Error t value Pr(>|t|)    
Linf 15.04002    0.60919  24.689  < 2e-16 ***
K0    0.16740    0.01895   8.833  < 2e-16 ***
t0   -3.67353    0.34427 -10.671  < 2e-16 ***
K1    0.11986    0.02007   5.971 2.63e-09 ***
t1    2.29970    0.31711   7.252 5.18e-13 ***
---

Only the values returned for the second dataset make sense for the species in question.

Why does the nls function return such different parameter values when using the same model, the same starting values, and very similar datasets?

1 Answer

I don't think there's anything wrong with the fits per se - they both look like reasonable fits to the given data. The problem appears to be that in the first set there is an apparent change in gradient that occurs around an age where there are relatively few data points.
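
One way to check that both parameter sets describe plausible curves, despite the very different Linf values, is to evaluate the model at the reported estimates. This is only a sketch: vb2 is a hypothetical helper name, and the ages 1 to 10 are an illustrative range, not the actual range of either data set.

```r
# Two-phase von Bertalanffy curve from the question, written as a function.
# `vb2` is a hypothetical helper name; ifelse() assigns Age == t1 to the
# second phase, where both branches give the same value anyway.
vb2 <- function(age, Linf, K0, t0, K1, t1) {
  ifelse(age < t1,
         Linf * (1 - exp(-K0 * (age - t0))),
         Linf * (1 - exp(-K0 * (t1 - t0) - K1 * (age - t1))))
}

# Parameter estimates copied from the two summary() outputs in the question.
est1 <- list(Linf = 408.9,    K0 = 5.477e-03, t0 = -2.934,   K1 = 7.596e-04, t1 = 2.246)
est2 <- list(Linf = 15.04002, K0 = 0.16740,   t0 = -3.67353, K1 = 0.11986,   t1 = 2.29970)

# Illustrative ages only; the real age range of the data is an assumption here.
ages <- 1:10

pred <- data.frame(
  Age  = ages,
  fit1 = do.call(vb2, c(list(age = ages), est1)),
  fit2 = do.call(vb2, c(list(age = ages), est2))
)
print(round(pred, 2))
```

Over these ages both parameter sets predict lengths of roughly 8 to 14, so an Linf of about 409 with a tiny K0 still traces a sensible curve where there are data. It just means the asymptote is not pinned down by the first data set, which matches the very large standard errors on Linf, K0 and K1.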

Here's the plot for the first data set:

library(ggplot2)

fit <- as.formula(y ~ Linf * (1 - exp(-K0 * (x - t0))) * (x < t1) +
                    Linf * (1 - exp(-K0 * (t1 - t0) - K1 * (x - t1))) * (x > t1))

ggplot(dataset, aes(Age, TL)) +
  geom_point() +
  geom_smooth(method = nls, formula = fit, method.args = list(
    start = list(Linf=17, K0=0.3, t0=-2, K1=0.1, t1=3), 
    control = list(maxiter = 10000, minFactor = 1e-9, tol = 1e-3)),
    se = FALSE, linetype = 2
  )

[Plot: TL against Age for the first data set, with the fitted nls curve as a dashed line]

But the data, and the shape of the plot, are quite different for the second data set:

ggplot(dataset2, aes(Age, TL)) +
  geom_point() +
  geom_smooth(method = nls, formula = fit, method.args = list(
    start = list(Linf=17, K0=0.3, t0=-2, K1=0.1, t1=3), 
    control = list(maxiter = 10000, minFactor = 1e-9, tol = 1e-3)),
    se = FALSE, linetype = 2
  )

[Plot: TL against Age for the second data set, with the fitted nls curve as a dashed line]

So the problem simply lies in your assumption that both data sets are similar. They are not very similar at all, at least in terms of fitting this model. For example, the first data set only has 52 individuals (11%) under the age of 4, but the second data set has 1279 (42%). There is clearly a big difference in the age distribution of the two samples. Note that combining the two data frames using rbind gives a single fit whose parameter estimates are similar to those obtained for dataset2 alone.
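
The age comparison above can be reproduced with a small helper. This is a minimal sketch: young_share is a hypothetical name, and toy1 and toy2 are made-up stand-ins for dataset and dataset2, so the numbers they give are not the real counts.

```r
# Hypothetical toy data standing in for `dataset` and `dataset2`;
# the real data frames have an Age column (and TL, not needed here).
toy1 <- data.frame(Age = c(2, 5, 6, 7, 8, 9, 10, 11, 12, 13))
toy2 <- data.frame(Age = c(1, 2, 2, 3, 5, 6, 7, 8, 9, 10))

# Count and share of individuals younger than `cutoff` years.
young_share <- function(df, cutoff = 4) {
  c(n = sum(df$Age < cutoff), share = mean(df$Age < cutoff))
}

young_share(toy1)
young_share(toy2)

# Stacking the two data frames, as done with rbind for the combined fit.
combined <- rbind(toy1, toy2)
young_share(combined)
```

With the real data, replacing toy1 and toy2 by dataset and dataset2 should give the counts quoted above, and the stacked data frame can be passed to the same nls call as before.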

Allan Cameron