4

I have a data set (below) with a power relationship. (Y =aX^b)

The power fit in Excel and xmgrace gave me an almost identical values for the fit. (R^2 of 0.993) Y = 215.47 X^0.812

However when I try R's nls() function I get a different value. Plus it does not calculate a R^2 because it is not statistically sound.

However if i take logarithms, I can do a lm() and get a R^2 of 0.993. How do I reproduce the values excel and xmgrace produce with power fits using R..Is R nls() not correct??

Drift Time  Mass_Independent CS
2.32    407.3417277
2.32    419.1267553
2.81    503.9859708
2.92    501.0465281
3.78    640.9024985
4.00    688.7906761
4.48    776.3958584
5.67    918.9991003
6.05    949.4448047
6.86    993.9763311
6.86    1064.539603
6.97    1041.422648
7.94    1112.407393
8.42    1183.070416
9.23    1302.622263
9.29    1291.525748
Tetsujin no Oni
  • 7,300
  • 2
  • 29
  • 46
lochi
  • 860
  • 2
  • 12
  • 26

1 Answers1

4

I think you would be foolish to trust an Excel estimate over an R estimate. The failings of Excel in the domain of regression are longstanding and well documented:

 nls(Mass_Ind_CS ~a*Drift_Time^b , dat, start=list(a=100, b=1))
#---------------------
Nonlinear regression model
  model:  Mass_Ind_CS ~ a * Drift_Time^b 
   data:  dat 
       a        b 
227.0176   0.7828 
 residual sum-of-squares: 10224

Number of iterations to convergence: 5 
Achieved convergence tolerance: 3.617e-06 
#---------------------
 plot(dat, xlim=range(dat$Drift_Time), ylim=range(dat$Mass_Ind_CS) )
 par(new=T)
 curve(215.47*x^0.812, from=min(dat$Drift_Time), 
                        to=max(dat$Drift_Time),
                         ylim=range(dat$Mass_Ind_CS) )
 par(new=T)
 curve(227.0176*x^0.7828, from=min(dat$Drift_Time), 
                          to=max(dat$Drift_Time), 
                          ylim=range(dat$Mass_Ind_CS),col="red")

The R estimate is plotted in red. It shows that you are wrong to focus on the parameter estimate without looking at the predictions over the range of the x=values. There is no real R-sq to estimate for solitary non-linear models, although you can do model comparisons with anova(). You are welcome to search out the reasons for the author of nls (Douglas Bates) not including them, because it is practically a FAQ on the r-help mailing list.

enter image description here

IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • OK..I didn't mean excel is better..It gave me a power fit that is very similar to what xmgrace produced. If you take logs you can build a linear model and get a R^2 Value..That value was (0.993)..Very similar to what I observed with power fits with excel and xmgrace.. Do you think building a linear model with logs is better than doing it this way ?? – lochi Feb 11 '12 at 02:44
  • If you fit a log(Y) ~ a + b*X model with Gaussian errors it is going to be different than the nls() result. And both those options are different than a log linear model done with glm(). In many cases this choice is better made with prior domain knowledge than on comparisons of fit that have no statistical basis for resolution. – IRTFM Feb 11 '12 at 03:16
  • Here's what I got with a linear model done in R with a log transformation 215.5084 * x^0.812 I believe both Excel and Xmgrace does just that..Thanks for answering – lochi Feb 11 '12 at 03:49
  • BTW, instead of R^2, take a look at things like rms error (predict vs. data points), or as DWin said, `anova` results. And above, all, ask yourself: what magnitude of prediction error can you accept? If, e.g., the difference between Excel's result and the correct (snarky me!) result is 10% of the acceptable prediction error limit, then you don't care. – Carl Witthoft Feb 11 '12 at 17:06