I have two simulation data sets and two experimental reference data sets.

The simulation was performed numerically, so no closed-form method or function is known; only the simulation data is available.

The two data sets share parameters that I want to extract by fitting the simulation to the reference data.

I did not find any Python functionality that performs such a fit (minimization / optimization) using only data sets instead of a fitting function or model.

Concretely, I have the following two equations:

e1 = a * s1 + b * t1 + c * u1

and

e2 = a * s2 + b * t2 + c * u2

I want to determine the parameters a, b, c.

e1, e2 are experimental NxN np.arrays (they can be visualized as heatmaps or considered as f(x,y)), and

s1, s2, t1, t2, u1, u2 are MxM np.arrays containing simulation data.

I want the left- and right-hand sides of each equation (the heatmaps) to be as similar as possible, and I want both equations to be taken into account equally when determining a, b, c.

It would take some effort to make N = M, but it could be done. I know I have to use two models, but I only know how to pass matching 1xN experimental and simulation arrays to the models.

Shudras
  • Hi, can you explain what you want to fit if there is no function to fit? What is your expected outcome? – mikuszefski Jul 30 '19 at 08:10
  • @mikuszefski I elaborated a little bit more. – Shudras Jul 30 '19 at 21:08
  • Hi, so the trick would be to define a distance function that gives you a value for how close heat map A is to heat map B. The fact that n != m complicates things, but interpolation might work out for you. Once you have the distance function, the rest is straightforward. How to build the distance function I can't tell from the information I have. A standard quadratic error would be a good start, I guess. For interpolation you might want to look into `scipy.interpolate.interp2d` – mikuszefski Jul 31 '19 at 15:37
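
The distance-function idea from the last comment could look roughly like the sketch below. It is only an illustration under assumptions: the simulation and experimental grids are assumed to cover the same x/y extent, the helper names `resample_bilinear` and `squared_distance` are made up here, and plain `np.interp` is used instead of `scipy.interpolate.interp2d` to keep the example dependency-free. All data is synthetic.

```python
import numpy as np

def resample_bilinear(grid, n):
    """Resample a square MxM array onto an n x n grid (separable linear interpolation)."""
    m = grid.shape[0]
    old = np.linspace(0.0, 1.0, m)  # assumption: both grids span the same extent
    new = np.linspace(0.0, 1.0, n)
    # First interpolate every row along axis 1, then every column along axis 0.
    rows = np.array([np.interp(new, old, row) for row in grid])       # shape (m, n)
    return np.array([np.interp(new, old, col) for col in rows.T]).T  # shape (n, n)

def squared_distance(params, e, s, t, u):
    """Quadratic error between the heatmap e and a*s + b*t + c*u."""
    a, b, c = params
    residual = e - (a * s + b * t + c * u)
    return float(np.sum(residual ** 2))

# Synthetic demo: simulation on a 6x6 grid (M = 6), reference on a 9x9 grid (N = 9).
rng = np.random.default_rng(0)
s, t, u = (rng.normal(size=(6, 6)) for _ in range(3))
s9, t9, u9 = (resample_bilinear(g, 9) for g in (s, t, u))
e = 2.0 * s9 + 0.5 * t9 - 1.0 * u9  # fake "experiment" with known a, b, c

print(squared_distance((2.0, 0.5, -1.0), e, s9, t9, u9))  # distance at the true parameters
print(squared_distance((1.0, 1.0, 1.0), e, s9, t9, u9))   # distance at a wrong guess
```

A function like `squared_distance` can then be handed to a general-purpose minimizer such as `scipy.optimize.minimize`, with the distances of both equations added together so that a, b, c are shared.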

1 Answer

I wrote a wrapper around scipy called symfit, which makes fitting this kind of problem straightforward, so I think you might be interested in using it. Using symfit for your problem, you could write:

from symfit import parameters, variables, Fit, Model

e1, e2, s1, s2, t1, t2, u1, u2 = variables('e1, e2, s1, s2, t1, t2, u1, u2')
a, b, c = parameters('a, b, c')

model = Model({
    e1: a * s1 + b * t1 + c * u1,
    e2: a * s2 + b * t2 + c * u2,
})

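# Pass your numpy arrays as keyword arguments named after the variables
# (u1data, s1data are placeholders; the remaining arguments are elided here).
fit = Fit(model, u1=u1data, s1=s1data, ...)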
fit_result = fit.execute()
print(fit_result)

See the documentation for more information. Good luck!
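
Since both equations are linear in a, b and c, the same shared-parameter fit can also be sketched with plain NumPy as a cross-check. This is not symfit's implementation, just ordinary linear least squares on the flattened heatmaps. It assumes the arrays already have matching shapes (N = M); the function name `fit_shared_parameters` and the demo data are made up for illustration.

```python
import numpy as np

def fit_shared_parameters(e_list, s_list, t_list, u_list):
    """Least-squares a, b, c shared by all equations e_k = a*s_k + b*t_k + c*u_k."""
    # Flatten every heatmap and stack the equations on top of each other,
    # so each pixel of each equation contributes one row.
    design = np.vstack([
        np.column_stack([s.ravel(), t.ravel(), u.ravel()])
        for s, t, u in zip(s_list, t_list, u_list)
    ])
    target = np.concatenate([e.ravel() for e in e_list])
    # Skip pixels where any of the arrays holds NaN or inf.
    valid = np.isfinite(target) & np.isfinite(design).all(axis=1)
    params, *_ = np.linalg.lstsq(design[valid], target[valid], rcond=None)
    return params

# Synthetic demo data with known parameters a=1.5, b=-0.5, c=2.0.
rng = np.random.default_rng(1)
s1, t1, u1, s2, t2, u2 = (rng.normal(size=(8, 8)) for _ in range(6))
e1 = 1.5 * s1 - 0.5 * t1 + 2.0 * u1
e2 = 1.5 * s2 - 0.5 * t2 + 2.0 * u2
e1[0, 0] = np.nan  # a missing measurement is simply ignored

a, b, c = fit_shared_parameters([e1, e2], [s1, s2], [t1, t2], [u1, u2])
print(a, b, c)
```

Stacking the rows of both equations into one system is what makes a, b, c shared: every pixel of e1 and e2 carries equal weight in the fit.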

tBuLi
  • Thank you. Does this work for shifted data sets, i.e. if the e1 data points are shifted and a * s1 + b * t1 + c * u1 are not? Further, how do I avoid NaN values being accepted, i.e. force standard deviations? – Shudras Aug 05 '19 at 19:35
  • You can always provide standard deviations to the fit as `sigma_e1=...`, see the docs for Fit. And in principle symfit doesn't care about the data, as long as the left-hand side and the right-hand side of each equation can be broadcast together by numpy. Judging whether the answer makes sense is up to you :). – tBuLi Aug 06 '19 at 09:23
  • Does symfit also provide functionality to take into account misalignment between reference data and data to be fitted? I.e. if e1[i] does not correspond to s1[i] but maybe e1[i] corresponds to s1[i + x]. (Comparison of central subsets.) – Shudras Aug 06 '19 at 19:26
  • Not built in, so you can either write a for loop and see for yourself which shift is best, or you can implement a CallableNumericalModel which has the integer `x` as an extra parameter. However, fitting with integer parameters is hard, so I'm not guaranteeing that it'll work. Bottom line: you'll have to play around with it yourself. Good luck! – tBuLi Aug 12 '19 at 11:00
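
The for-loop option from the last comment might be sketched like this. It uses plain NumPy least squares rather than symfit, handles a single equation, and assumes e, s, t, u are square arrays of the same size whose misalignment is a whole number of pixels. The name `scan_shifts` and the demo data are hypothetical.

```python
import itertools
import numpy as np

def scan_shifts(e, s, t, u, max_shift):
    """Try every integer offset (dx, dy); return (cost, (dx, dy), params) of the best one."""
    n = e.shape[0]
    k = max_shift
    core = e[k:n - k, k:n - k].ravel()  # central subset of the reference data
    best = None
    for dx, dy in itertools.product(range(-k, k + 1), repeat=2):
        # Simulation window such that e[i, j] is compared with s[i + dx, j + dy].
        window = (slice(k + dx, n - k + dx), slice(k + dy, n - k + dy))
        design = np.column_stack([arr[window].ravel() for arr in (s, t, u)])
        params, *_ = np.linalg.lstsq(design, core, rcond=None)
        cost = float(np.sum((design @ params - core) ** 2))
        if best is None or cost < best[0]:
            best = (cost, (dx, dy), params)
    return best

# Synthetic demo: the reference is the simulation mix shifted so that
# e[i, j] corresponds to the simulation at [i + 1, j - 2].
rng = np.random.default_rng(2)
s, t, u = (rng.normal(size=(12, 12)) for _ in range(3))
mix = 0.7 * s + 1.2 * t - 0.4 * u
e = np.roll(mix, shift=(-1, 2), axis=(0, 1))

cost, shift, params = scan_shifts(e, s, t, u, max_shift=3)
print(shift, params)
```

Only the central subset of the reference data is compared, so every candidate shift is scored on the same number of pixels. For the two-equation problem, the design matrices and targets of both equations could be stacked inside the loop before solving.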