
I have two functions

  m1 = f1(w, s)
  m2 = f2(w, s)

f1() and f2() are both black boxes. Given w and s, I can get m1 and m2.

Now, I need to design or find a function g, such that

   m2' = g(m1)

Also, the difference between m2 and m2' must be minimized.

Both w and s are stochastic processes.

How can I find such a function g()? What knowledge domain does this belong to?

user2420472
  • I believe this belongs on [math.stackexchange](http://math.stackexchange.com/). – 500 - Internal Server Error Feb 13 '14 at 15:14
  • Do you have an example? Can you produce a scatter plot of pairs (m1, m2) from a sampling of points (w,s)? If that diagram is not close to a curve, then there is no hope of finding g. If that diagram does not look like a graph of a function, e.g., multiple values of m2 over the same m1, then finding g is also complicated. If it looks good, polynomial regression might already give a sufficient g. – Lutz Lehmann Feb 13 '14 at 15:55
  • Sounds related to regression problems. Can you invoke f1, f2 several times? – amit Feb 13 '14 at 15:59
  • @500-InternalServerError If any, to stats.SE, but if I understood this problem correctly - it is on scope for SO as well, as it is solvable by a program rather easily when you understand what you should do. – amit Feb 13 '14 at 16:15
  • This question appears to be off-topic because it is about mathematics, not programming. – chepner Feb 13 '14 at 16:19
  • @chepner Did you read my comment or answer? It seems to be perfectly about programming if I understood it correctly. It is solvable not by math, but by an actual program (OLS, for example). – amit Feb 13 '14 at 16:27
  • Just because something is solvable by a program does not make it a programming question. You are asking about the mathematical background needed to solve the problem, not any specific programming questions regarding an implementation. – chepner Feb 13 '14 at 16:37
  • @chepner This is not the case. He was not asking "why does OLS work?" He wanted to find an algorithm. There is not much difference (conceptually) between this question and "how can I sort an array?" or "How to find shortest path in a maze?" or "How to parse a string using regex?", which are fine (though probably dupe) questions, and perfectly on scope – amit Feb 13 '14 at 17:35
  • @amit, yes, i can invoke f1, f2 several times. – user2420472 Feb 14 '14 at 00:00

2 Answers


Assuming you can invoke f1 and f2 as many times as you want, this can be solved using regression.

  • Build a training set: (w_1,s_1,m2_1),...,(w_n,s_n,m2_n).
  • 'Convert' the set into input-output pairs for g: (m1_1,m2_1),...,(m1_n,m2_n).
  • Create your 'basis functions'. For example, with polynomial basis functions up to degree 3, the 'modified' training set will be (1,m1_1,m1_1^2,m1_1^3,m2_1), ... It is easy to generalize this to any polynomial degree or any other set of basis functions.
  • Now you have a problem which can be solved by linear regression using ordinary least squares (OLS).

However, note that for some functions it might be impossible to find a good model to fit, since you lose information when you reduce the dimensionality from 2 (w, s) to 1 (m1).
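The steps above can also be sketched in plain Python, with OLS solved directly via the normal equations (a minimal illustration; the example black boxes, sample size, and polynomial degree are arbitrary assumptions, not from the question):

```python
import random

# Hypothetical example black boxes (stand-ins for the real f1 and f2).
def f1(w, s):
    return w**2 + s**3 - 1

def f2(w, s):
    return s**2 - w + 2

# 1. Build a training set by sampling (w, s) and invoking the black boxes.
random.seed(0)
samples = [(random.random(), random.random()) for _ in range(200)]
m1 = [f1(w, s) for w, s in samples]
m2 = [f2(w, s) for w, s in samples]

# 2. Polynomial basis functions up to degree 3: [1, x, x^2, x^3].
d = 4
A = [[x**j for j in range(d)] for x in m1]

# 3. OLS: solve the normal equations (A^T A) theta = A^T m2.
AtA = [[sum(row[i] * row[j] for row in A) for j in range(d)] for i in range(d)]
Atb = [sum(row[i] * y for row, y in zip(A, m2)) for i in range(d)]

def solve(M, b):
    """Gaussian elimination with partial pivoting (no external libraries)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

theta = solve(AtA, Atb)

# g(m1) is the fitted polynomial; how well it tracks m2 depends entirely
# on how much information about m2 survives in m1.
def g(x):
    return sum(t * x**j for j, t in enumerate(theta))

# Compare the estimate against the true m2 on a fresh point:
w, s = random.random(), random.random()
print(g(f1(w, s)), f2(w, s))
```

In practice one would use a library least-squares routine instead of hand-rolled elimination; the point is only that the whole recipe is a few lines once the training pairs (m1_i, m2_i) are collected.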

MATLAB code snippet (with arbitrarily chosen example functions):

%example black-box functions
f = @(w,s) w.^2 + s.^3 - 1;
g = @(w,s) s.^2 - w + 2;
%random points for sampling
w = rand(1,100);
s = rand(1,100);
%the data, as column vectors
m1 = f(w,s)';
m2 = g(w,s)';
%build the design matrix: column jj holds the basis function m1.^(jj-1),
%so the first column is all ones (the intercept term)
d = 5;
points = size(m1,1);
A = ones(points,d);
for jj=1:d
    A(:,jj) = m1.^(jj-1);
end
%OLS: solve the normal equations
theta = pinv(A'*A)*A'*m2;

%new point:
w = rand(1,1);
s = rand(1,1);
m1 = f(w,s);
%estimate m2 at the new point:
A = ones(1,d);
for jj=1:d
    A(:,jj) = m1.^(jj-1);
end
%the estimation:
estimated = A*theta
%the real value, for comparison:
g(w,s)
amit
  • I think "can be solved by linear regression" is a bit of an overstatement. There is no guarantee this will work well (even though it "usually" might). – oseiskar Feb 13 '14 at 18:23
  • @oseiskar As I stated explicitly as well. Linear Regression will do 'its best' to fit the data, but since data is lost when the dimensionality was reduced - there is no guarantee the generated model will fit. – amit Feb 13 '14 at 18:25
  • What I meant to say was that it is not guaranteed to work well even if the dimensionality was not reduced. Take m1 = sin(s), m2 = cos(s), s ~ Uniform[0,2pi) for example. – oseiskar Feb 13 '14 at 18:40
  • @amit, about " for jj=1:d ; A(:,jj) = m1.^(jj-1); " , when jj = 1, all elements of m1 will become 1s. So, in this way, the first column of A will be all 1s. Is it a typo? Thanks! – user2420472 Mar 01 '14 at 22:02
  • @user2420472 No. In linear regression we actually want a 'column' of ones. Note the matrix will have its first column all ones, but the rest of the elements will depend on the data. This allows us to find a model of the form a0 + a1x + a2x^2 + ... + anx^n. Without the ones, we could not get the `a0`. – amit Mar 01 '14 at 23:47
  • @amit, I think " %new point: w = rand(1,1); s = rand(1,1); " should be "%new point: w = rand(1,100); s = rand(1,100);", right ? – user2420472 Mar 02 '14 at 00:51
  • @user2420472 This is testing for a single point only. it could work for 100 points as well though – amit Mar 02 '14 at 07:21
  • @amit, about " w = rand(1,1); ", if you run the program , you will get error (dimensions do not match), about "theta = pinv(A'*A)*A'*m2;", you use m2 to get theta. But, I need to use m1 to predict m2 so that I can get approximate values of m2 without computing m2. Any help would be appreciated. Thanks ! – user2420472 Mar 02 '14 at 14:17

This kind of problem is studied in fields such as statistics and inverse problems. Here's one way to approach it theoretically (from the point of view of inverse problems):

First of all, it is quite clear that in the general case the function g might not exist. However, what you can (try to) compute, given that you (assume to) know something about the statistics of w and s, is the posterior probability density p(m2|m1), which can then be used to compute estimators for m2 given m1, for instance a maximum a posteriori (MAP) estimate.

The posterior density can be computed using Bayes' formula:

p(m2|m1) = (\int p(m1,m2|w,s) p(w,s) dw ds) / (\int p(m1|w,s) p(w,s) dw ds)

which, in this case, might be (theoretically) nasty to apply, since some of the involved marginal probability densities are singular. The best way to proceed numerically depends on the additional assumptions you can make about the statistics of w and s (e.g., Gaussian) and the functions f1, f2 (e.g., smooth). There is no silver bullet.
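A crude numerical route in this spirit (a sketch only; the black boxes, the input distributions, and the bin width below are arbitrary assumptions) is to approximate the posterior by Monte Carlo: sample many (w, s), bin the resulting (m1, m2) pairs by m1, and take the conditional mean within each bin, i.e., the MMSE estimate rather than the MAP one:

```python
import math
import random
from collections import defaultdict

# Hypothetical black boxes and input distributions (pure assumptions,
# chosen only so the script runs end to end).
def f1(w, s):
    return math.sin(w) + s

def f2(w, s):
    return math.cos(w) + 0.5 * s

random.seed(0)
draws = [(random.gauss(0, 1), random.random()) for _ in range(100000)]
pairs = [(f1(w, s), f2(w, s)) for w, s in draws]

# Bin the samples by m1; each bin is a Monte Carlo picture of p(m2 | m1).
bin_width = 0.05
bins = defaultdict(list)
for a, b in pairs:
    bins[round(a / bin_width)].append(b)

def g(m1_value):
    """Estimate m2 as the conditional mean of the nearest populated bin."""
    key = round(m1_value / bin_width)
    if key not in bins:
        key = min(bins, key=lambda k: abs(k - m1_value / bin_width))
    ys = bins[key]
    return sum(ys) / len(ys)

# Compare the estimate against the true m2 on a fresh draw:
w, s = random.gauss(0, 1), random.random()
print(g(f1(w, s)), f2(w, s))
```

This makes no smoothness or Gaussianity assumptions, but it only works when m1 is low-dimensional and the black boxes are cheap to sample, and the residual spread within each bin shows directly how much of m2 is simply not recoverable from m1.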

amit's OLS solution is probably a good starting point. Just be sure to sample from the correct distributions for w and s.

oseiskar