
There might be an answer for that floating around somewhere, but I wasn't able to find it.

I want to minimize `||A X - B||_F^2 + lambd * ||X D^T||_F^2` with variable `X >= 0` and first-derivative matrix `D`, so that `X` is smooth in the column direction; the data is relatively large.

In the past I have used various ways of solving this (e.g. with scipy.optimize.lsq_linear or pylops) and now wanted to try out cvxpy with the generic approach below (using SCS, because the problem wouldn't fit into memory otherwise):

import cvxpy as cp
import numpy as np

def cvxpy_solve_smooth(A, B, lambd=0):
    D = get_D(B.shape[1])                      # first-difference matrix along the columns of X
    X = cp.Variable((A.shape[1], B.shape[1]))
    cost1 = cp.sum_squares(A @ X - B)          # data-fit term ||A X - B||_F^2
    cost2 = lambd * cp.sum_squares(X @ D.T)    # smoothness term lambd * ||X D^T||_F^2
    constr = [0 <= X]
    prob = cp.Problem(cp.Minimize(cost1 + cost2), constr)
    prob.solve(solver=cp.SCS)
    return X.value

def get_D(n):
    # (n-1) x n first-difference matrix: -1 on the diagonal, +1 on the superdiagonal
    D = np.diag(np.ones(n - 1), 1)
    np.fill_diagonal(D, -1)
    D = D[:-1]
    return D

However, this is considerably slower than scipy.optimize.lsq_linear with sparse matrices. What can I do in terms of problem formulation, solver options, cvxpy advanced features etc. to improve performance?

zeawoas

1 Answer


SCS-based

Check what SCS is configured with in your option-free call. I suspect it starts in direct mode; you should also try indirect mode (use_indirect=True). Turning on verbosity (verbose=True) will show you which mode is currently being used.
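For example, a quick comparison could look like this (a minimal sketch, reusing `prob` from `cvxpy_solve_smooth` above):

prob.solve(solver=cp.SCS, verbose=True)                      # default settings (direct mode)
prob.solve(solver=cp.SCS, use_indirect=True, verbose=True)   # indirect (CG-based) mode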

Alternatives

This looks like a smooth optimization problem with only bound constraints. I doubt that SCS is the right approach here; it's just too powerful and too general.

I would fire up L-BFGS-B (which supports bound constraints) (reference impl), e.g. through scipy. For a prototype you might get away with automatic numerical differentiation, but for your final version you should provide a customized gradient. I suspect it will be much more efficient than SCS, but that also depends on the accuracy you are striving for.
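For illustration, a minimal sketch of that route (my own code, not a drop-in replacement: it assumes `A` and `B` are numpy/scipy arrays as in the question, reuses `get_D` from there, and the name `lbfgsb_solve_smooth` is just for this example):

import numpy as np
import scipy.sparse as sp
from scipy.optimize import minimize

def lbfgsb_solve_smooth(A, B, lambd=0):
    # minimize ||A X - B||_F^2 + lambd * ||X D^T||_F^2  subject to  X >= 0
    m, k = A.shape[1], B.shape[1]
    D = sp.csr_matrix(get_D(k))      # first-difference matrix from the question
    DtD = (D.T @ D).tocsr()          # precompute D^T D, used in objective and gradient

    def fun_and_grad(x):
        X = x.reshape(m, k)
        R = A @ X - B                # residual of the data-fit term
        S = (DtD @ X.T).T            # equals X D^T D (DtD is symmetric)
        f = np.sum(R * R) + lambd * np.sum(X * S)
        g = 2.0 * (A.T @ R) + 2.0 * lambd * S
        return f, g.ravel()

    res = minimize(fun_and_grad, np.zeros(m * k), jac=True, method="L-BFGS-B",
                   bounds=[(0, None)] * (m * k))
    return res.x.reshape(m, k)

The analytic gradient is just `2 A^T (A X - B) + 2 * lambd * X D^T D`, which is what L-BFGS-B consumes via `jac`; both `A` and `D^T D` can stay sparse.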

sascha
  • Thanks for the suggestions. In my case `use_indirect=True` increased runtime 6-fold. Regarding L-BFGS-B: the dataset itself is actually not that large; it's just that the stacked formulation of the linear least-squares problem required for smoothing in the column direction wouldn't fit into memory (with one block matrix `A` per column of `X`). Thus, if formulated as a non-linear optimization problem (e.g. for `scipy.optimize.minimize`), I could also go for faster algorithms. However, I guess these would most probably still be slower than linear least squares or a QP formulation. – zeawoas May 14 '20 at 21:22
  • I still think L-BFGS-B will be hard to beat. lsq_linear with sparse matrices is an IPM (at least from reading the short scipy docs; maybe I'm wrong), which does not scale well compared to L-BFGS, and QP solvers are usually simplex-type (many iterations needed) or IPMs (again: scaling). You also pay for the general support of constraints you don't need. You can still store a sparse matrix and calculate the gradient with sparse math when using L-BFGS-B. – sascha May 14 '20 at 21:31
  • Finally had a go at L-BFGS-B and it was by far the fastest solver available via `scipy.optimize.minimize` and indeed faster than `lsq_linear` for large problems. Do you have any tips on how to easily generate a customized gradient/jacobian for the objective described in the question? – zeawoas May 20 '20 at 13:01