How to curve fit data to a known formula with two independence variables?

Question

Suppose I have a nonlinear formula like y = (ax + bx^2 + cx^3) * dx1^2

Where a,b,c,d are coefficients to be found and x and x1 are data in a table I am trying to fit.

I need an optimization algorithm I can write in code (e.g C or Delphi) to run through an iteration to get reasonable coefficients a,b,c,d.

Don't want to use Matlab or packages as this must be a stand alone program. Reference to a delphi or active X unit is helpful. Don't mind to pay for the software if I can use it freely.

So, your data table contains multiple triples `(x, dx1, y)` and you want to find the coefficients `a, b, c`? This is perfectly linear (in the unknowns). Thus, you can formulate a [linear least squares](https://en.wikipedia.org/wiki/Linear_least_squares) problem and solve it. — Nico Schertler, Oct 23 '18 at 06:26
Note that if the last operator is really a product you do not need `d`, that is already in `a`, `b`, and `c` — mikuszefski, Oct 23 '18 at 08:48
I suggest researching "multiple linear regression" in the programming language of choice, you can then format x, x^2, and (x^3 * x1^2 in the code and linearly regress that to get a, b, and c. An offset parameter is often useful, for that format 1.0 (offset times 1) in the code. — James Phillips, Oct 23 '18 at 11:20

MvG · Answer 1 · 2018-10-26T06:51:26.610

3

Your problem is linear in a, b, c and d even if it is cubic in the data. Therefore I would suggest formulating this as an ordinary linear least squares problem. Allow me to rename your x1 to z. The idea is this: you have

ax_i + bx_i² + cx_i³ + dz_i² ≈ y_i

for some i∈{1,2,3…n}. You can write that as an approximate matrix equation:

⎡x₁ x₁² x₁³ z₁²⎤   ⎡a⎤   ⎡y₁⎤
⎢x₂ x₂² x₂³ z₂²⎥   ⎢b⎥   ⎢y₂⎥
⎢x₃ x₃² x₃³ z₃²⎥ ∙ ⎢c⎥ ≈ ⎢y₃⎥
⎢ ⋮ ⋮  ⋮  ⋮ ⎥   ⎣d⎦   ⎢⋮⎥
⎣xₙ xₙ² xₙ³ zₙ²⎦         ⎣yₙ⎦

Or more shortly

M ∙ X ≈ Y

Now you multiply both sides with the transpose of that matrix M:

M^T ∙ M ∙ X = M^T ∙ Y

Notice that I changed from ≈ to = because the least squares solution will satisfy this modified equation exactly (for lengthy reasons I don't want to go into here). This is a simple 4×4 system of linear equations. Solve using common techniques (such as Gaussian elimination) to find X=(a,b,c,d).

If n is large you can even compute (M^T ∙ M) and (M^T ∙ Y) on the fly without ever storing M itself. That way 4×4+4=20 numbers will be all the memory you need to maintain between input records. Actually (M^T ∙ M) is symmetric so 10 numbers are enough for the matrix, 14 in total.

edited Oct 26 '18 at 06:51

answered Oct 23 '18 at 21:12

MvG

57,380
22
148
276

uhhh, that version is slightly oversimplified. – mikuszefski Oct 24 '18 at 09:21
@mikuszefski: Oversimplified in what way? I left out all the rationale as to *why* this works, or the discussion as to what this actually optimizes. But for the question at hand as to “how do I get reasonable coefficients” I feel it does the job. But I'l be very interested to read your take on this, so feel free to write an independent or complimentary answer. – MvG Oct 24 '18 at 15:18
well, after teaching for more than a decade I can only say that the right result does not mean that the answer is right. Least square means that you have a vector `E` such that `M k = Y + E` where `k` is the coefficient vector. After minimizing `E_T E` with respect to `k` you get `M_T M k_opt = M_T Y`. In this sense your answer is right but the reason is...uhhh...oversimplified....Although, I guess that you know that. – mikuszefski Oct 25 '18 at 06:34
@mikuszefski: I didn't even mean to claim to *have* an explanation. Just instructions, recipe style. In terms of explanation I prefer the geometric view, of orthogonal projection from Y onto the space spanned by M, but that would take some more work to elaborate, and would not be the right mindset for everyone either. – MvG Oct 25 '18 at 15:55
1

..agreed...I just was irritated by the statement *Now you multiply both sides with the transpose of that matrix M, and you can aim for equality* as this sort of implies that this is the reason/right way of doing it; never mind. – mikuszefski Oct 26 '18 at 06:15
@mikuszefski: Thanks, now I see how that phrase was misleading. I changed that bit, hope this is less ambiguous now. – MvG Oct 26 '18 at 06:55

How to curve fit data to a known formula with two independence variables?

1 Answers1