
I am trying to fit a power function with a negative exponent (y = a*x^b) to some data in R, but the fitted line looks too high compared to the data, whereas the fit I get using Excel's built-in power trendline looks more believable. Can someone tell me why? I've tried using the nls() function and also optim(), and I get similar parameters from both methods, but both fitted curves look too high.

   x    <- c(5.96, 12.86, 8.40, 2.03, 12.84, 21.44, 21.45, 19.97, 8.92, 25.00, 19.90, 20.00, 20.70, 16.68, 14.90, 26.00, 22.00, 22.00, 10.00, 5.70, 5.40, 3.20, 7.60, 0.59, 0.14, 0.85, 9.20, 0.79, 1.40, 2.68, 1.91)
   y    <- c(5.35, 2.38, 1.77, 1.87, 1.47, 3.27, 2.01, 0.52, 2.72, 0.85, 1.60, 1.37, 1.48, 0.39, 2.39, 1.83, 0.71, 1.24, 3.14, 2.16, 2.22, 11.50, 8.32, 38.98, 16.78, 32.66, 3.89, 1.89, 8.71, 9.74, 23.14)

    xy.frame <- data.frame(x,y)

    nl.fit <- nls(formula=(y ~ a * x^b), data=xy.frame, start = c(a=10, b=-0.7))

    a.est <- coef(nl.fit)[1]
    b.est <- coef(nl.fit)[2]

    plot(x=xy.frame$x,y=xy.frame$y)

    # curve looks too high
    curve(a.est * x^b.est, add = TRUE)
    # these parameters from Excel seem to fit better
    curve(10.495 * x^-0.655, add = TRUE)

[figure: scatter of the data with the nls fit and the Excel fit; the nls curve sits higher]

    # alternatively use optim()
    # starting values for a, b, and sigma
    theta.init <- c(1000, -0.5, 50)

    # negative log-likelihood assuming y ~ Normal(a * x^b, sigma)
    exp.nll <- function(theta, data){
      a      <- theta[1]
      b      <- theta[2]
      sigma  <- theta[3]
      obs.y  <- data$y
      x      <- data$x
      pred.y <- a * x^b
      nll <- -sum(dnorm(x = obs.y, mean = pred.y, sd = sigma, log = TRUE))
      nll
    }

    fit.optim <- optim(par = theta.init, fn = exp.nll, method = "BFGS", data = xy.frame)

    plot(x=xy.frame$x,y=xy.frame$y)

    # still looks too high (this time plotting the optim estimates)
    a.opt <- fit.optim$par[1]
    b.opt <- fit.optim$par[2]
    curve(a.opt * x^b.opt, add = TRUE)

[figure: scatter of the data with the optim fit, which also looks too high]


1 Answer


The reason you're seeing this behavior is that the curves that look "too high" actually have much lower sums of squared errors (SSE) than the curve from Excel:

    # Fit from nls
    sum((y - a.est * x^b.est)^2)
    # [1] 1588.313

    # Fit from Excel
    sum((y - 10.495 * x^-0.655)^2)
    # [1] 1981.561

The reason nls favors the higher curve is that it works to avoid huge squared errors at small x values, where the y values are largest, at the cost of slightly larger errors at large x values.
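To see this concretely, compare the residuals of the two fits at the smallest x values (a quick sketch reusing `a.est` and `b.est` from the question):

    # the smallest x values carry the largest y values, so they dominate the SSE
    small <- order(x)[1:5]
    cbind(x = x[small], y = y[small],
          nls.resid   = y[small] - a.est * x[small]^b.est,
          excel.resid = y[small] - 10.495 * x[small]^-0.655)

One way to address this might be to apply a log-log transformation: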

    mod <- lm(log(y) ~ log(x))
    (a.est2 <- exp(coef(mod)["(Intercept)"]))
    # (Intercept)
    #    10.45614
    (b.est2 <- coef(mod)["log(x)"])
    #     log(x)
    # -0.6529741

These are quite close to the coefficients from Excel, and yield a more visually appealing fit (despite the worse performance on the sum-of-squared-errors metric).
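As a quick check of that trade-off, a sketch reusing `a.est2` and `b.est2` from above computes the SSE of the back-transformed fit on the original scale and redraws the curve:

    # SSE of the back-transformed log-log fit; per the comparison above,
    # this comes out larger than the nls SSE of 1588.313
    sum((y - a.est2 * x^b.est2)^2)

    # redraw the data with the back-transformed fit
    plot(x = xy.frame$x, y = xy.frame$y)
    curve(a.est2 * x^b.est2, add = TRUE)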

[figure: scatter of the data with the back-transformed log-log fit]

  • Just out of curiosity, if Excel isn't trying to minimize the SSE, what criterion is it using? – eipi10 Oct 21 '15 at 02:07
  • @eipi10 Though I'm not positive, [it looks like](http://www.real-statistics.com/regression/power-regression/) it is using a log-log transformation, as well. Therefore, it's minimizing the SSE when predicting `log(y)` instead of minimizing the SSE when predicting `y`. – josliber Oct 21 '15 at 02:16
  • I also had issues with this negative-exponent fit having different outputs in R and in Excel -- using the log-log transformation in R yielded the same results as Excel. Another case where Excel does some hidden arithmetic to make the results "look" nicer without informing the user! – user3386170 Jul 25 '22 at 14:39
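A minimal sketch of what the comments above describe, assuming Excel's power trendline minimizes squared error on the log scale; since `log(a)` and `b` are just a reparametrization of a linear model in `log(x)`, this should reproduce the `lm()` coefficients:

    # fitting on the log scale with nls recovers the lm(log(y) ~ log(x)) fit
    log.fit <- nls(log(y) ~ log(a) + b * log(x), data = xy.frame, start = c(a = 10, b = -0.7))
    coef(log.fit)
    # a ~ 10.456, b ~ -0.653 (matching a.est2 and b.est2 above)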