1

I am interested to reproduce results calculated by the GNU plugin to MS Word WordMat in R, but I can't get them to arrive at similar results (I am not looking for identical, but simply similar).

I have some y and x values and a power function, y = bx^a

Using the following data,

x <- c(15,31,37,44,51,59)
y <- c(126,71,61,53,47,42)

I get a = -0.8051 and b = 1117.7472 in WordMat, but a = -0.8026 and B = 1108.2533 in R, slightly different values.

Am I using the nls function in some wrong way or is there a better (more transparent) way to calculate it in R?

Data and R code,

# x <- c(15,31,37,44,51,59)
# y <- c(126,71,61,53,47,42)
df <- data.frame(x,y)
moD <- nls(y~a*x^b, df, start = list(a = 1,b=1))
summary(moD)

Formula: y ~ a * x^b

Parameters:
    Estimate Std. Error t value Pr(>|t|)    
a  1.108e+03  1.298e+01   85.35 1.13e-07 ***
b -8.026e-01  3.626e-03 -221.36 2.50e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3296 on 4 degrees of freedom

Number of iterations to convergence: 19 
Achieved convergence tolerance: 5.813e-06
Eric Fail
  • 8,191
  • 8
  • 72
  • 128
  • Well, mean abs. errors are: MAE_WordMat = 0.218, MAE_R = 0.195, so clearly R is doing better. – m-dz Jul 08 '16 at 08:32
  • Thank you for responding to my question. How did you obtain the Mean absolute error (MAE) from WordMat and from R? Could you possible provide a reference that can help me convince my teacher that R is doing better and WordMat is not? – Eric Fail Jul 08 '16 at 11:12
  • Just calculated `y_WordMat` and `y_R` and compared with `y`, it just showed that both are close, but R is closer. – m-dz Jul 08 '16 at 14:02

1 Answers1

1

It looks like WordMat is estimating the parameters of y=b*x^a by doing the log-log regression rather than by solving the nonlinear least-squares problem:

> x <- c(15,31,37,44,51,59)
> y <- c(126,71,61,53,47,42)
> 
> (m1 <- lm(log(y)~log(x)))

Call:
lm(formula = log(y) ~ log(x))

Coefficients:
(Intercept)       log(x)  
     7.0191      -0.8051  
> exp(coef(m1)[1])
(Intercept) 
   1117.747 

To explain what's going on here a little bit more: if y=b*x^a, taking the log on both sides gives log(y)=log(b)+a*log(x), which has the form of a linear regression (lm() in R). However, log-transforming also affects the variance of the errors (which are implicitly included on the right-hand side of the question), meaning that you're actually solving a different problem. Which is correct depends on exactly how you state the problem. This question on CrossValidated gives more details.

Community
  • 1
  • 1
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Thanks, this is super helpful! – Eric Fail Jul 16 '16 at 13:46
  • StackOverflow deprecates [using comments to say "thank you"](http://meta.stackoverflow.com/questions/258004/should-thank-you-comments-be-flagged?lq=1); if this answer was useful you can upvote it (if you have sufficient reputation), and in any case if it answers your question satisfactorily you are encouraged to click the check-mark to accept it. – Ben Bolker Jul 17 '16 at 16:29
  • I appreciate you educate me about SO culture, I guess that's not a direct "thank you." I apologize for not _check-mark to accept_ your answer. I was too focused to rewarding you the bounty. – Eric Fail Jul 17 '16 at 17:39