
Given the following datapoints

enter image description here

I'm trying to find the best fitting model using the method of least squares.

Two models are given.

  1. enter image description here
  2. enter image description here

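My approach was to rewrite the two equations into the following linear forms.

$$Y = aX$$

where

$$Y = \frac{1}{y} - \frac{1}{8}$$

and

$$X = \frac{1}{x}$$

$$Y = \ln a + bX$$

where

$$Y = \ln(8 - y)$$

and

$$X = \ln x$$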

I wrote the following MATLAB code to compute the coefficients a,b for the different equations.

For the first equation I wrote the following code to evaluate the coefficient a:

x = [150 200 300 500 1000 2000]';
y = [2 3 4 5 6 7]';
func =@(x) (1/x-1/8);
yy=arrayfun(func,y);
A = 1./x;
c= A\yy; yanp= A*c; error = yy-yanp;
rms(error) % Root mean square error.

This gives a = 48.4692 with a root mean square error of 0.0310.

For the second equation I wrote the following code to evaluate the coefficients a and b:

x = [150 200 300 500 1000 2000]';
y = [2 3 4 5 6 7]';
yy = log(8-y);
A = [ones(6,1) log(x)];
c= A\yy; yanp= A*c;  error= yy-yanp;
a = exp(c(1)); %Converting back
b= c(2);
rms(error)

This gives a = 174.5247 and b = -0.6640 with a root mean square error of 0.0756.

My results suggest that the first equation is the better approximation, since its error is smaller. However, my fellow students claim that the second equation gives the smaller error and hence is the better approximation. I suspect I've made a mistake somewhere in my calculations, and I'm looking for guidance.

bullbo
  • I don't know... from a visual point of view, when plotting both regressions (`figure(),plot(x,y),hold on,plot(x,yanp),hold off;`) something tells me the second one is better. And it should return a lower RMSE since predicted deviates much less from observed values... there must be something wrong happening with the log transformation. – Tommaso Belluzzo Jan 24 '18 at 23:09

1 Answer

In your second case you're not computing your error correctly. yanp approximates log(8-y), not y, so yy-yanp is a residual in log units. You need to convert yanp back to "true units" and compare with the input y:

error = y-(8-exp(yanp));
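As a minimal sketch of how the comparison could be redone, the script below fits both models in their linearized form and then measures the error in the units of y for each. It assumes the models implied by the question's code, 1/y - 1/8 = a/x and 8 - y = a*x^b. The names rmse, yfit1, yfit2, err1 and err2 are introduced here for illustration, and the residual is not called error so that it does not shadow MATLAB's built-in function.

```matlab
% Data taken from the question.
x = [150 200 300 500 1000 2000]';
y = [2 3 4 5 6 7]';

% Illustrative helper: root mean square of a residual vector.
rmse = @(r) sqrt(mean(r.^2));

% Model 1, fitted in the linearized form 1/y - 1/8 = a * (1/x).
A1 = 1 ./ x;
a1 = A1 \ (1 ./ y - 1/8);
yfit1 = 1 ./ (a1 ./ x + 1/8);    % back-transform to the units of y
err1 = rmse(y - yfit1);

% Model 2, fitted in the linearized form log(8 - y) = log(a) + b * log(x).
A2 = [ones(numel(x), 1) log(x)];
c2 = A2 \ log(8 - y);
yfit2 = 8 - exp(A2 * c2);        % back-transform, as in the line above
err2 = rmse(y - yfit2);

fprintf('Model 1 RMSE in units of y: %.4f\n', err1);
fprintf('Model 2 RMSE in units of y: %.4f\n', err2);
```

With both errors expressed in the same units, the second model should come out with the smaller RMSE. Note that the first model's 0.0310 was also measured in transformed units (1/y - 1/8), which is why the two original numbers were not comparable.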
Cris Luengo