Modeling
As a modeling problem, you have to choose how to go from the 4-dimensional space to a one-dimensional space. You need a projection function p : R^4 -> R.
The projection function encodes part of the knowledge you have about your problem.
If you know that the first parameter has a lot more importance than the others, you could use the function p : (x,y,z,t) -> x and ignore the other parameters. In general you don't have this knowledge, so we apply Occam's razor (this is the machine-learning part of this modeling problem) and keep the model as simple as possible, but no simpler:
For this example I choose: (1) p : (x,y,z,t) -> x + y + z + t
so each parameter of a sample contributes to the result in the same way. Another option could be: (2) p : (x,y,z,t) -> x*y*z*t
But after a log transformation, (2) looks like (1), since log(x*y*z*t) = log(x) + log(y) + log(z) + log(t).
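To make the two options concrete, here is a minimal Python sketch (the names p_sum and p_prod are mine, just for illustration):

import math

def p_sum(x, y, z, t):
    # projection (1): every parameter contributes equally
    return x + y + z + t

def p_prod(x, y, z, t):
    # projection (2): parameters interact multiplicatively
    return x * y * z * t

# the log transform gives (2) the shape of (1):
# log(x*y*z*t) == log(x) + log(y) + log(z) + log(t)
print(math.log(p_prod(1.5, 2.3, 4.2, 0.9)))
print(sum(math.log(v) for v in (1.5, 2.3, 4.2, 0.9)))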
The learning function you choose is f(x) -> a*x^3 + b*x^2 + c*x + d. You must be careful about how you combine the projection with the learning function.
Applying 'p' to the model does not give:
f(x,y,z,t) ->
a*(x + y + z + t)^3 +
b*(x + y + z + t)^2 +
c*(x + y + z + t)^1 +
d*(x + y + z + t)^0
but :
f(x,y,z,t) ->
a*(x)^3 + b*(x)^2 + c*(x) + d +
a*(y)^3 + b*(y)^2 + c*(y) + d +
a*(z)^3 + b*(z)^2 + c*(z) + d +
a*(t)^3 + b*(t)^2 + c*(t) + d
This is the independence property of your parameters: you apply the learning function to each parameter separately. The knowledge linking the parameters has already been encoded in the choice of the '+' operator between them.
So, solving the learning problem for the parameters (a1, b1, c1, d1) of the following model should be enough:
f(x,y,z,t) ->
a1*(x^3 + y^3 + z^3 + t^3) +
b1*(x^2 + y^2 + z^2 + t^2) +
c1*(x + y + z + t) +
d1
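In code, this is just the same cubic applied to each parameter and summed (a minimal Python sketch; the function name is mine):

def f(x, y, z, t, a1, b1, c1, d1):
    # the same cubic is applied to each parameter independently,
    # and the contributions are combined with '+', as chosen above
    return (a1 * (x**3 + y**3 + z**3 + t**3)
            + b1 * (x**2 + y**2 + z**2 + t**2)
            + c1 * (x + y + z + t)
            + d1)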
Solving
Working with gnuplot, this problem can be solved with the fit function (see chapter 7 of the gnuplot documentation).
f(x,y,z,t) = a1*(x**3 + y**3 + z**3 + t**3) + b1*(x**2 + y**2 + z**2 + t**2) + c1*(x + y + z + t) + d1
fit f(x,y,t,u) 'mydata.dat' using 1:2:3:4:5 via a1, b1, c1, d1
NB1: when using fit, the variable name 'z' is reserved for something else (see 'help fit'), so you must rename that variable when calling fit.
NB2: mydata.dat contains the samples, with columns separated by tabs. You must also add the known value of 'y' for each sample, so 'mydata.dat' has 5 columns (and the file name must end in '.dat').
So here is the mydata.dat file I used:
1.5 2.3 4.2 0.9 1.0
1.2 0.3 1.2 0.3 2.0
0.5 1.3 2.2 1.5 3.0
4.2 2.5 3.2 6.2 4.0
As you can see, I added the 'y' column that gives the expected value for each sample.
Then run the tool in a console:
gnuplot> f(x,y,z,t) = a1*(x**3 + y**3 + z**3 + t**3) + b1*(x**2 + y**2 + z**2 + t**2) + c1*(x + y + z + t) + d1
gnuplot> fit f(x,y,t,u) 'mydata.dat' using 1:2:3:4:5 via a1, b1, c1, d1
[many lines]
After 9 iterations the fit converged.
final sum of squares of residuals : 8.7617e-31
abs. change during last iteration : -2.9774e-30
Exactly as many data points as there are parameters.
In this degenerate case, all errors are zero by definition.
Final set of parameters
=======================
a1 = 0.340413
b1 = -2.7489
c1 = 6.44678
d1 = -4.86178
So the problem is solved.
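If you want to double-check the result outside gnuplot, you can plug the fitted values back into the model and compare the predictions with the fifth column of mydata.dat (a small Python sketch; it does nothing more than recompute the predictions):

# fitted parameters reported by gnuplot above
a1, b1, c1, d1 = 0.340413, -2.7489, 6.44678, -4.86178

def f(x, y, z, t):
    return (a1 * (x**3 + y**3 + z**3 + t**3)
            + b1 * (x**2 + y**2 + z**2 + t**2)
            + c1 * (x + y + z + t)
            + d1)

samples = [
    (1.5, 2.3, 4.2, 0.9, 1.0),
    (1.2, 0.3, 1.2, 0.3, 2.0),
    (0.5, 1.3, 2.2, 1.5, 3.0),
    (4.2, 2.5, 3.2, 6.2, 4.0),
]
for x, y, z, t, expected in samples:
    print(f(x, y, z, t), 'expected:', expected)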
Gnuplot is open source, so looking at the source could be a good starting point if you want to code it yourself. You can also start with "help fit" in gnuplot; it mentions the nonlinear least-squares (NLLS) Marquardt-Levenberg algorithm.
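If you would rather reuse an existing implementation than read the gnuplot source, SciPy is one option: scipy.optimize.curve_fit uses a MINPACK Levenberg-Marquardt routine for unconstrained problems, the same family of algorithm as gnuplot's fit. A sketch (assuming SciPy is available; this is an alternative, not what I ran above):

import numpy as np
from scipy.optimize import curve_fit

def model(X, a1, b1, c1, d1):
    x, y, z, t = X
    return (a1 * (x**3 + y**3 + z**3 + t**3)
            + b1 * (x**2 + y**2 + z**2 + t**2)
            + c1 * (x + y + z + t)
            + d1)

data = np.loadtxt('mydata.dat')   # 5 tab-separated columns
X = data[:, :4].T                 # the four parameters of each sample
target = data[:, 4]               # the known 'y' value of each sample
params, _ = curve_fit(model, X, target)
print(params)                     # a1, b1, c1, d1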
Implementing an equivalent algorithm doesn't really require all this math knowledge (the math is only needed for speed).
All you have to do is run a search algorithm (a genetic algorithm, for example: encode the parameters a1-d1 in binary and mutate them randomly), where the criterion to optimize is the least-squares error over the learning samples.
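For instance, even a very naive random search over (a1, b1, c1, d1) drives the least-squares error down on the samples above (a Python sketch; I use plain random perturbations with a shrinking step rather than a real binary-encoded genetic algorithm, but the optimized criterion is the same):

import random

samples = [
    (1.5, 2.3, 4.2, 0.9, 1.0),
    (1.2, 0.3, 1.2, 0.3, 2.0),
    (0.5, 1.3, 2.2, 1.5, 3.0),
    (4.2, 2.5, 3.2, 6.2, 4.0),
]

def f(x, y, z, t, a1, b1, c1, d1):
    return (a1 * (x**3 + y**3 + z**3 + t**3)
            + b1 * (x**2 + y**2 + z**2 + t**2)
            + c1 * (x + y + z + t)
            + d1)

def sq_error(params):
    # search criterion: sum of squared errors over the learning samples
    return sum((f(x, y, z, t, *params) - expected) ** 2
               for x, y, z, t, expected in samples)

best = [0.0, 0.0, 0.0, 0.0]
best_err = sq_error(best)
for step in (10.0, 1.0, 0.1, 0.01, 0.001):
    for _ in range(50000):
        candidate = [p + random.uniform(-step, step) for p in best]
        err = sq_error(candidate)
        if err < best_err:        # keep the candidate only if it improves the fit
            best, best_err = candidate, err
print(best, best_err)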