This isn't a full solution, more like some thoughts that may lead to one. Also, someone please double-check my math.
Variables and test data
First, let's begin by defining some test data. While doing so, I transpose the a1a2a3a4 matrix because it will prove more convenient. I'm also renaming it to a14 because I cannot be bothered to write the whole thing out every time, sorry.
import numpy as np
from numpy import linalg, random
from scipy import optimize

a14 = random.uniform(-1., 1., (500, 4)) * (0.25, 0.75, 1., 2.)  # 500 samples of the four basis columns
at = random.uniform(-1., 1., 500)                               # target Y values
xpoints = np.cumsum(random.random(500))                         # increasing X coordinates
initial_guess = np.full(4, 0.25)
Target
Okay, here is my first slight point of confusion: OP described the target as the "area difference" between two graphs. X coordinates are given by xpoints. Y coordinates of the target are given by at, and we search for coeffs so that a14 @ coeffs gives Y coordinates close to the target.
Now, when we talk about the "area under the graph", my first instinct is piecewise linear. Similar to the trapezoidal rule, this gives us the formula sum{(y[i+1]+y[i]) * (x[i+1]-x[i])} / 2.
We can reformulate that a bit to get something of the form (y[0]*(x[1]-x[0]) + y[1]*(x[2]-x[0]) + y[2]*(x[3]-x[1]) + ... + y[n-1]*(x[n-1]-x[n-2])) / 2. We want to precompute the x differences and the scaling factor.
weights = 0.5 * np.concatenate(
((xpoints[1] - xpoints[0],),
xpoints[2:] - xpoints[:-2],
(xpoints[-1] - xpoints[-2],)))
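As a quick sanity check (my addition, feel free to skip it), applying these weights to arbitrary Y values should reproduce the trapezoidal sum from the formula above.
# sanity check: weights @ y equals the trapezoidal sum over (xpoints, y)
y_check = random.uniform(-1., 1., 500)
trapezoid_area = 0.5 * ((y_check[1:] + y_check[:-1]) * (xpoints[1:] - xpoints[:-1])).sum()
assert np.isclose(weights @ y_check, trapezoid_area)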
These factors are different from the ones OP used. I guess they are integrating a piecewise constant function instead of a piecewise linear one? Anyway, switching between the two should be simple enough, so I leave this here as an alternative.
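For reference, a piecewise-constant version could look something like the sketch below. Whether each interval should be attributed to its left endpoint, its right endpoint, or split between them is guesswork on my part, so treat it as an illustration rather than a reconstruction of OP's factors.
# piecewise-constant alternative: y[i] covers the interval [x[i], x[i+1]],
# the last point gets no interval (a guess, not confirmed against OP's code)
weights_const = np.append(xpoints[1:] - xpoints[:-1], 0.)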
Loss function
Adapting OP's loss function to my interpretation gives
def loss(coeffs):
absdiff = abs(at - (a14 * coeffs).sum(axis=1))
return (absdiff * weights).sum()
As I've noted in the comments, this can be simplified by using matrix-vector multiplications.
def loss(coeffs):
return abs(a14 @ coeffs - at) @ weights
This is the same as abs((a14 * weights[:, np.newaxis]) @ coeffs - (at * weights)).sum(), which in turn makes it obvious that we are talking about minimizing the L1 norm.
a14w = a14 * weights[:, np.newaxis]
atw = at * weights
def loss(coeffs):
return linalg.norm(a14w @ coeffs - atw, 1)
This isn't a huge improvement as far as computing the loss function goes. However, it goes a long way towards casting our problem into a standard form.
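Just to make sure nothing got lost in the rewriting, here is a small consistency check (my addition) comparing the first and the last formulation at the initial guess:
# the elementwise formulation and the weighted-L1 formulation should agree
assert np.isclose(
    (abs(at - (a14 * initial_guess).sum(axis=1)) * weights).sum(),
    linalg.norm(a14w @ initial_guess - atw, 1))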
Approximate solution
As noted in the comments, the abs() function in the L1 norm is poison for optimization because it is non-differentiable. If we switch to an L2 norm, we basically have a least-squares fit with additional constraints. Of course this is somewhat unsound, as we start solving a different, if closely related, problem. I suspect the solution would be "good enough"; or it could be the starting solution that is then polished to conform with the actual target.
In any case, we can use scipy.optimize.lsq_linear as a quick and easy solver. However, it only supports bounds, not the linear equality constraint that the coefficients sum to one. We can mimic that constraint with a regularization parameter weighting an extra penalty row.
def loss_lsqr(coeffs):
return 0.5 * linalg.norm(a14w @ coeffs - atw)**2
# gradient of loss_lsqr is a14w.T @ (a14w @ coeffs - atw)
grad1 = a14w.T @ a14w
grad2 = a14w.T @ atw
def jac_lsqr(coeffs):
return grad1 @ coeffs - grad2
regularization = loss_lsqr(initial_guess) # TODO: Pick better value
# extra penalty row nudging the solution towards sum(coeffs) == 1
a14w_regular = np.append(a14w, np.full((1, 4), regularization), axis=0)
atw_regular = np.append(atw, regularization)
approx = optimize.lsq_linear(a14w_regular, atw_regular, bounds=(0., 1.))
# polish further, maybe?
bounds = ((0., 1.),) * 4
constraint = optimize.LinearConstraint(np.ones(4), 1., 1.)  # sum(coeffs) == 1
second_guess = approx.x
second_guess /= second_guess.sum()  # enforce the sum-to-one constraint
exact = optimize.minimize(
loss_lsqr, second_guess, jac=jac_lsqr,
bounds=bounds, constraints=constraint)
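Finally, it cannot hurt to confirm that the polished result respects the constraint and to compare the unpolished and polished coefficients under the original weighted L1 loss (again just a quick check of my own):
# quick check: the coefficients should sum to ~1 and stay within [0, 1]
print(exact.x, exact.x.sum())
# original weighted-L1 loss before and after polishing
print(loss(approx.x), loss(exact.x))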