I'm trying to develop some code to test whether using an auxiliary data source significantly improves the predictability of a final product. I have the data ready in MATLAB, which is my preferred program for analysis.
I'm trying to fit the following model:
P(t,i) = a(i) + b(i)*Z(t,i) + c(i)*Y(t,i) + d(i)*X(t,i) + e(i)*W(i)
where P, Z, Y, X and W are known, t and i are indices, and I wish to find the values of a, b, c, d and e that minimise the difference between the existing values of P and the predicted values of P.
t = 1:20 and i ~ 1:250000
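To make the setup concrete, here is a minimal sketch of one way the fit could be done independently for each i, using ordinary least squares via the backslash operator. It assumes P, Z, Y and X are stored as t-by-i matrices; the small synthetic arrays, the sizes nT and nI, and the name coef are placeholders for illustration, not my real data. Note that because W(i) does not vary with t, the term e(i)*W(i) is a constant for a fixed i and cannot be separated from a(i) in a per-i fit, so the sketch estimates a single intercept in its place.

```matlab
% Synthetic stand-in data (hypothetical; the real problem has nT = 20, nI ~ 250000)
nT = 20;                      % number of time steps t
nI = 5;                       % number of series i
rng(0);                       % fixed seed so the run is repeatable
Z = randn(nT, nI);
Y = randn(nT, nI);
X = randn(nT, nI);
P = 1 + 2*Z - 0.5*Y + 0.3*X + 0.01*randn(nT, nI);   % known "truth" plus small noise

% One small regression per column i; rows of coef are [a; b; c; d]
coef = zeros(4, nI);
for i = 1:nI
    A = [ones(nT, 1), Z(:, i), Y(:, i), X(:, i)];   % nT-by-4 design matrix for series i
    coef(:, i) = A \ P(:, i);                       % least-squares solution
end
```

Each column of coef then holds the four fitted values for one i, so the output is 4-by-nI rather than growing with the width of the input matrices.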
Eventually I will set the value of e(i) to zero and see how much improvement I get from adding the extra variable, before testing with a random number stream too.
If more detail is needed I will try to provide it. Many thanks.
I've tried the method suggested below. However, because my Z, Y and X values are matrices, the output matrix sol is 3 times the width of t, plus the one element for e. I've read around further and think the method should be either a generalised linear model or a panel regression model, but I'm not sure how to set one up. I've re-read the examples from MathWorks a few times and am still confused.