
I've noticed that when I fit a power-law model in R, it almost always under-predicts compared to one fitted in Python. Here is an R example:

# make data
x <- seq(1, 10, 1)
y <- c(1, 2.1, 3.5, 4.5, 4.8, 6, 9, 14, 20, 24.75)

# fit a linear model on the log-log scale and overlay the back-transformed curve
model.pl <- lm(log(y) ~ log(x))
plot(x, y)
x_axis <- seq(1, 10, 1)
lines(x_axis, exp(predict(model.pl, data.frame(x = x_axis))), col = 'red', lty = 2, lwd = 2)
summary(model.pl)

print(exp(model.pl$coefficients[1])) # 0.8093913
print(model.pl$coefficients[2])      # 1.325091
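
Back-transforming the coefficients gives the power-law form of this fit explicitly: since the model is log(y) = c + b*log(x), the fitted curve is y = exp(c) * x^b. A minimal sketch, continuing from the fit above:

# back-transform: log(y) = c + b*log(x)  is equivalent to  y = exp(c) * x^b
a <- unname(exp(coef(model.pl)[1]))  # ~0.809
b <- unname(coef(model.pl)[2])       # ~1.325
# so the fitted power law is approximately y = 0.809 * x^1.325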

And in Python using SciPy:

import pandas as pd
from scipy.optimize import curve_fit

lst = list(range(1, 11, 1))
lst2 = [1, 2.1, 3.5, 4.5, 4.8, 6, 9, 14, 20, 24.75]

df = pd.DataFrame(list(zip(lst, lst2)), columns=['x', 'y'])

# power-law model fitted directly on the raw data
def objective(x, a, b):
    return a * x ** b

x, y = df["x"], df["y"]

popt, _ = curve_fit(objective, x, y)

a, b = popt
print(a)  # 0.10104993
print(b)  # 2.38573123

So I end up with quite different parameters, which translates to different models when plotted below. Is this to be expected? I assumed they would fit essentially identical models:

[Plot: Python model (left), R model (right)]
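
One way to quantify the mismatch is to score each parameter pair under both least-squares criteria. This is a minimal sketch using the rounded parameter values from the fits above: `lm` on the log-log scale minimises squared error in log units, while `curve_fit` minimises it in raw units, so each pair should win under its own objective.

x <- seq(1, 10, 1)
y <- c(1, 2.1, 3.5, 4.5, 4.8, 6, 9, 14, 20, 24.75)

# raw-scale SSE: the objective curve_fit minimises
sse_raw <- function(a, b) sum((y - a * x^b)^2)
# log-scale SSE: the objective lm(log(y) ~ log(x)) minimises
sse_log <- function(a, b) sum((log(y) - (log(a) + b * log(x)))^2)

sse_raw(0.8094, 1.3251)  # lm pair: larger raw-scale error
sse_raw(0.1010, 2.3857)  # curve_fit pair: smaller raw-scale error
sse_log(0.8094, 1.3251)  # lm pair: smaller log-scale error
sse_log(0.1010, 2.3857)  # curve_fit pair: larger log-scale error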

As per phipsgabler's suggestion, I modified the call to drop the intercept, but this does not seem to change the model all that much:

# no intercept: log(y) = b * log(x), i.e. y = x^b with a fixed at 1
model.pl <- lm(log(y) ~ log(x) - 1)
x_axis <- seq(1, 10, 1)
lines(x_axis, exp(predict(model.pl, data.frame(x = x_axis))), col = 'red', lty = 2, lwd = 2)
summary(model.pl)
print(model.pl$coefficients[1]) # the only coefficient now: the exponent b
  • I think R includes an intercept by default. Could you try to compare with `log(y) ~ log(x) - 1`? – phipsgabler Aug 25 '22 at 08:23
  • This doesn't change the trajectory of the model that much. Also, I'm not sure how I would back-transform the model so I can write out the formula without the log transforms. – hmnoidk Aug 25 '22 at 08:30
  • https://stats.stackexchange.com/questions/254689/linear-regression-with-log-transformed-data-large-error/254706#254706 – Roland Aug 25 '22 at 08:31
  • https://stats.stackexchange.com/questions/254876/regression-models-for-log-transformed-data-without-multiplicative-error/255265#255265 – Roland Aug 25 '22 at 08:33
  • Ah okay, your second thread seems to have helped me, Roland. So basically, the `curve_fit` function in Python includes the error term, if I'm understanding correctly? – hmnoidk Aug 25 '22 at 11:13
  • A regression model always includes an error term. The point is that one of your models includes a multiplicative and the other an additive error term. – Roland Aug 25 '22 at 11:17
  • How can I extract the residuals to do RMSE from the NLS version of the model? Or are these assumed to be the same as those of the model that was fed to the NLS version? – hmnoidk Aug 25 '22 at 12:24 [see the sketch below]
  • Either fit an OLS log-log model with Python or fit a non-linear regression model with R. You have no chance of getting the same results if you use two statistical models with different assumptions. So, you first need to make a choice regarding statistical methodology (see the links above) before you work on the implementation. – Roland Aug 25 '22 at 13:45
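
Following Roland's advice, the `curve_fit` model can be reproduced in R by fitting the non-linear (additive-error) model directly with `nls()`. This is a minimal sketch, assuming the starting values below are good enough for convergence; the coefficients should come out close to SciPy's, and the residuals for an RMSE come straight from the fitted object:

x <- seq(1, 10, 1)
y <- c(1, 2.1, 3.5, 4.5, 4.8, 6, 9, 14, 20, 24.75)

# same model and objective as curve_fit: minimise sum((y - a*x^b)^2)
model.nls <- nls(y ~ a * x^b, start = list(a = 1, b = 1))
coef(model.nls)  # expect roughly a = 0.101, b = 2.386

# residuals and RMSE of the non-linear fit
res <- residuals(model.nls)
sqrt(mean(res^2))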
