I'm trying to fit nonlinear curves while specifying a covariance structure based on a phylogeny. To do this, I am using gnls from nlme to fit my model, while the covariance structure is created using corBrownian from ape. The analysis runs fine, but I am finding that gnls doesn't produce a curve with parameters that fit my data well.
To reproduce what is happening, I've created a bogus dataset with a seemingly nonlinear relationship between the x and y variables.
> bogus<-read.table('bogus.txt',header=T)
> library(nlme)
> library(ape)
> bogus
x y
t1 11.55906 2.8300968
t2 11.69827 2.3313596
t3 11.76451 2.9851054
t4 12.08264 2.5448035
t5 12.12407 2.4908419
t6 12.25148 2.1676187
t7 12.53406 2.7367371
t8 12.56658 2.9068268
t9 12.63566 2.7088351
t10 12.74374 2.1415571
t11 12.75834 2.3817043
t12 12.82976 2.7454244
t13 12.89204 2.1909294
t14 12.90101 3.2203981
t15 13.14088 2.6648673
t16 13.26564 2.4982850
t17 13.35068 3.1950533
t18 13.46560 3.7170424
t19 14.27643 2.4693087
t20 14.50607 2.5048268
t21 14.77404 3.3430531
t22 15.02226 1.9006652
t23 15.58785 3.0900789
t24 16.54972 2.5029559
t25 16.62484 2.1646191
t26 17.12499 2.6146433
t27 17.82371 3.2210873
t28 18.80876 2.5716595
t29 19.82221 2.7564073
t30 19.97193 2.7730234
t31 20.02177 2.9545869
t32 20.72805 2.9256731
t33 21.23050 2.0299999
t34 21.55609 0.0000000
t35 21.60317 1.9895633
t36 21.62067 1.3089420
t37 21.73584 1.2215530
t38 21.84590 1.1517204
t39 22.04924 0.8980439
t40 22.21454 1.5719580
t41 22.21508 1.9236154
t42 22.25437 2.0794511
t43 22.59725 0.9415377
t44 23.00222 0.5226221
t45 24.15081 0.5897821
t46 24.78190 0.0000000
t47 24.93000 0.5375178
t48 25.16983 0.9432479
t49 25.70231 0.3888160
t50 26.06668 0.0000000
t51 26.52506 0.8258348
t52 26.62164 0.0000000
t53 28.97916 0.0000000
t54 29.29434 0.6803803
t55 29.58485 0.2704070
t56 29.67962 0.0000000
t57 29.85867 0.5700013
t58 30.33632 0.0000000
t59 30.37679 0.4963074
t60 31.17407 0.0000000
Since this is made up data, I don't have a phylogeny for it, so I simulate it using rtree from ape (since the tree will be randomly generated, there will be slight discrepancies, but the overall result shouldn't be too different). Using this tree, I create my covariance structure and run my analysis using the model I specified (called 'mod', here).
> tree<-rtree(60)
> bogus_cov<-corBrownian(1,tree)
> mod<-y~A-(A/(1+(x/K)^S))
> bogus_fit<-gnls(mod,start=c(A=3,K=22,S=35),correlation=bogus_cov,data=bogus)
The results from this analysis:
> summary(bogus_fit)
Generalized nonlinear least squares fit
Model: mod
Data: bogus
AIC BIC logLik
131.5422 139.9196 -61.77112
Correlation Structure: corBrownian
Formula: ~1
Parameter estimate(s):
numeric(0)
Coefficients:
Value Std.Error t-value p-value
A -2.25642 0.178272 -12.65720 0.0000
K 21.79689 0.193781 112.48183 0.0000
S 33.94954 12.613109 2.69161 0.0093
Correlation:
A K
K -0.262
S 0.450 -0.041
Standardized residuals:
Min Q1 Med Q3 Max
0.7277937 1.8177889 2.1017363 2.3306128 2.9470426
Residual standard error: 1.261279
Degrees of freedom: 60 total; 57 residual
Just at first glance, we can see that the asymptote (A) is placed at -2, which is well below any y value that is in my bogus dataset. Furthermore, based on how my data are arranged, the S parameter should be negative. This is more evident when the data are plotted with the curve with the parameter estimates:
> plot(bogus,ylim=c(-5,5))
> lines(bogus$x,predict(bogus_fit),lwd=1.5,col='red')
I guess my question is, why is gnls not fitting the curve to my data? Isn't the whole point of least squares to minimize SSE? Is this a matter of a bad model? Is there something I'm missing or misunderstanding about nls/gnls in general? Is there a way to fix this so that the curve does fit the data better?