I'm trying to understand the documentation on the scipy.optimize.least_squares function:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html

The possibilities for inputs are a bit messy, and the documentation is not super clear on all of them. It says that the argument jac can be a callable returning a LinearOperator.

  1. I suppose the provided LinearOperator is meant to represent the Jacobian as a linear operator mapping variable shifts to residual shifts. Or is it the other way round?

  2. Which operations do I need to implement for the LinearOperator? Only matvec, or matmat as well?

  3. Does providing a LinearOperator instead of the full Jacobian matrix actually speed anything up? Or is the full matrix built from the operator anyway? (And yes, in my example, evaluating the LinearOperator is much faster than building the whole Jacobian matrix.)

Andi Bauer

1 Answer

  1. least_squares expects the Jacobian of the original functions, i.e. the linear operator mapping variable shifts to shifts in the function values, so your first interpretation is correct. If what you know instead is a Jacobian for the residuals, you should probably switch to a general-purpose optimization routine (MATLAB's fmincon or, in SciPy, scipy.optimize.minimize) and work with the residuals directly. One of the main advantages of the least-squares approach is that it lets you work efficiently in the language of the original functions instead of the residuals.
  2. least_squares invokes matmat, matvec and rmatvec, but LinearOperator can derive matmat from matvec automatically when only matvec is provided (and vice versa). What it cannot derive is rmatvec, so you have to supply rmatvec (or rmatmat) yourself; in practice implementing matvec and rmatvec is enough (see the sketch below).
  3. Most of the time only products such as J(x).T.dot(f) are needed, so the full matrix is never built from the operator. That said, I did notice some numerical differences between runs with a matrix Jacobian and runs with an operator Jacobian.
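
Here is a minimal sketch of the operator-Jacobian setup, assuming a toy linear residual f(x) = A @ x - b. The matrix A, vector b, problem sizes and the explicit tr_solver='lsmr' choice are all illustrative, not taken from the question; the point is only to show a jac callable returning a LinearOperator that implements matvec and rmatvec and maps variable shifts to residual shifts:

    import numpy as np
    from scipy.optimize import least_squares
    from scipy.sparse.linalg import LinearOperator

    # Toy problem (illustrative only): residuals f(x) = A @ x - b, so the
    # Jacobian is simply A and can be exposed as an operator without ever
    # handing the solver a dense matrix.
    rng = np.random.default_rng(0)
    m, n = 50, 10
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    def fun(x):
        return A @ x - b  # residuals, shape (m,)

    def jac(x):
        # J maps variable shifts (R^n) to residual shifts (R^m):
        # matvec computes J @ v, rmatvec computes J.T @ w.
        return LinearOperator(
            (m, n),
            matvec=lambda v: A @ v,
            rmatvec=lambda w: A.T @ w,
        )

    x0 = np.zeros(n)
    # With an operator Jacobian the trust-region subproblems are solved by
    # 'lsmr' (the dense 'exact' solver needs an actual matrix).
    res = least_squares(fun, x0, jac=jac, tr_solver="lsmr")
    print(res.x)

Passing the dense matrix directly, e.g. jac=lambda x: A, should give essentially the same solution, which is a quick way to compare the operator version against the matrix version and to observe the small numerical differences mentioned in point 3.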
Askold Ilvento