I'm working with scipy.optimize.minimize to find the minimum of the residual sum of squares (RSS) for a custom nonlinear function. I'll provide a simple example (linear in the coefficients) to illustrate what I am doing:

import numpy as np
from scipy import optimize

# Model: linear in the coefficients b0, b1, b2; X holds two predictor rows.
def response(X, b0, b1, b2):
    return b2 * X[1]**2 + b1 * X[0] + b0

# Objective: residual sum of squares between the targets and the model.
def obj_rss(model_params, y_true, X):
    return np.sum((y_true - response(X, *model_params))**2)

# Two predictors with ten observations each; targets generated with
# b0 = 10, b1 = -32, b2 = 15 and no noise.
x = np.array([np.arange(0, 10), np.arange(10, 20)])
r = 15. * x[1]**2 - 32. * x[0] + 10.

init_guess = np.array([0., 50., 10.])
res = optimize.minimize(obj_rss, init_guess, args=(r, x))

print(res)

This yields the results:

      fun: 3.0218799331864133e-08
 hess_inv: array([[  7.50606278e+00,   2.38939463e+00,  -8.33333575e-02],
       [  2.38939463e+00,   8.02462363e-01,  -2.74621294e-02],
       [ -8.33333575e-02,  -2.74621294e-02,   9.46969972e-04]])
      jac: array([ -3.31359843e-07,  -5.42022462e-08,   2.34304025e-08])
  message: 'Optimization terminated successfully.'
     nfev: 45
      nit: 6
     njev: 9
   status: 0
  success: True
        x: array([ 10.00066577, -31.99978062,  14.99999243])

And we see that the fitted parameters 10, -32, and 15 match those used to generate the data. That's great. Now my question:

My understanding is that the Jacobian should be an m x n matrix, where m is the number of records in the X input and n is the number of parameters. Clearly I don't have that in the result object. The result object yields an array that the documentation (1 and 2) refers to as the Jacobian, but it is only one-dimensional, with a number of elements equal to the number of parameters.
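For reference, here is a quick numerical check (continuing from the snippet above) that compares the returned jac with a finite-difference gradient of the scalar objective at the solution, using scipy's own approx_fprime helper (the step size is an arbitrary choice). Both arrays have length 3, one entry per parameter, and both are essentially zero at the minimum:

# Compare the solver's reported jac with a finite-difference gradient
# of the scalar objective, evaluated at the fitted parameters.
num_grad = optimize.approx_fprime(res.x, obj_rss, 1.49e-8, r, x)
print(res.jac)    # length 3: one entry per fitted parameter
print(num_grad)   # length 3 as well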

Further confusing the matter, when I use method='SLSQP', the Jacobian that is returned has one more element than that returned by other minimization algorithms.

. . .

My larger goal here is to be able to calculate either confidence intervals or standard errors, t-, and p-values for the fitted parameters, so if you think I'm way off track here, please let me know.
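To frame that goal, here is a minimal sketch of the kind of calculation I have in mind, using scipy.optimize.curve_fit, which solves the same least-squares problem and also returns an estimated covariance matrix for the parameters. The noise term is my own addition for illustration, so that the error estimates don't collapse to zero on the exact toy data above:

import numpy as np
from scipy import optimize, stats

def response(X, b0, b1, b2):
    return b2 * X[1]**2 + b1 * X[0] + b0

x = np.array([np.arange(0, 10), np.arange(10, 20)])
rng = np.random.RandomState(0)
# Noise added (assumption for illustration) so standard errors are nonzero.
r = 15. * x[1]**2 - 32. * x[0] + 10. + rng.normal(scale=5., size=10)

popt, pcov = optimize.curve_fit(response, x, r, p0=[0., 50., 10.])

se = np.sqrt(np.diag(pcov))       # standard errors of the estimates
dof = r.size - popt.size          # residual degrees of freedom
t_vals = popt / se                # t-statistics for H0: coefficient = 0
p_vals = 2. * stats.t.sf(np.abs(t_vals), dof)

# 95% confidence intervals from the t critical value.
t_crit = stats.t.ppf(0.975, dof)
ci = np.column_stack([popt - t_crit * se, popt + t_crit * se])

print(popt)
print(se)
print(p_vals)
print(ci)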

EDIT:

The following is intended to show how the SLSQP minimization algorithm yields a different Jacobian than the default minimization algorithm, which is one of BFGS, L-BFGS-B, or SLSQP, depending on whether the problem has bounds or constraints (as mentioned in the documentation). The SLSQP solver is the one intended for use with constraints.

import numpy as np
from scipy import optimize


def response(X, b0, b1, b2):
    return b2 * X[1]**2 + b1 * X[0] + b0

def obj_rss(model_params, y_true, X):
    return np.sum((y_true - response(X, *model_params))**2)


x = np.array([np.arange(0, 10), np.arange(10, 20)])
r = 15. * x[1]**2 - 32. * x[0] + 10.

init_guess = np.array([0., 50., 10.])
res = optimize.minimize(obj_rss, init_guess, method='SLSQP', args=(r, x))

print(res)

r_pred = response(x, *res.x)  # predicted response at the fitted parameters

Yields results:

     fun: 7.5269461938291697e-10
     jac: array([  2.94677643e-05,   5.52844499e-04,   2.59870917e-02,
         0.00000000e+00])
 message: 'Optimization terminated successfully.'
    nfev: 58
     nit: 10
    njev: 10
  status: 0
 success: True
       x: array([ 10.00004495, -31.9999794 ,  14.99999938])

One can see that there is an extra element in the Jacobian array returned by the SLSQP solver. I am confused about where it comes from.
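For completeness, here is a compact way to see the shape discrepancy side by side, reusing obj_rss, init_guess, r, and x from the snippet above:

# Compare the length of the reported jac across solvers.
# On the scipy version used here, BFGS and L-BFGS-B report length 3,
# while SLSQP reports length 4.
for method in ('BFGS', 'L-BFGS-B', 'SLSQP'):
    res_m = optimize.minimize(obj_rss, init_guess, method=method, args=(r, x))
    print(method, res_m.jac.shape)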

Comments:

- The Jacobian is the derivative of `obj_rss` with respect to `model_params`. `model_params` is a vector with length 3 (see `init_guess`), and `obj_rss` returns a scalar, so in this case the Jacobian is a vector of length 3. It holds `[d(obj_rss)/d(model_params[0]), d(obj_rss)/d(model_params[1]), d(obj_rss)/d(model_params[2])]`, where `d(foo)/d(bar)` means the derivative of `foo` with respect to `bar`. – Warren Weckesser Oct 05 '16 at 21:25
- Thanks, Warren. I think I was a bit confused about the nature of the Jacobian. This helps. Would you happen to know why SLSQP returns a vector that has one additional element (i.e., number of parameters + 1)? I'm not very familiar with the different minimization routines. – khiner Oct 11 '16 at 19:37
- Where do you see this? You have `init_guess = np.array([0., 50., 10.])`, which has length 3, so there are 3 parameters. I was wondering why you used only `X[0]` and `X[1]` in `response()`. If `X` is supposed to be a vector of length 2, then `init_guess` should also have length 2. – Warren Weckesser Oct 11 '16 at 20:13
- I've edited the post to show the output while explicitly using the SLSQP solver. In regard to X[0] and X[1]: they are each an array of length 10, meant to simulate independent variables that are predictors of the response variable. The response function calculates the predicted values from the input regressors, then obj_rss calculates the residual sum of squares between that vector output and the true target data. The minimization iterates over the coefficients (b) to identify the best-fit values, so the initial guess should be for the three coefficients in the model. – khiner Oct 12 '16 at 14:31
- Ah, sorry, when I came back to the question after a few days, I didn't look closely enough at the code. So the question remains: why do you get `jac: array([ 2.94677643e-05, 5.52844499e-04, 2.59870917e-02, 0.00000000e+00])` when `x` has length 3? – Warren Weckesser Oct 12 '16 at 16:47
- I created a scipy issue for this: https://github.com/scipy/scipy/issues/6676 – Warren Weckesser Oct 12 '16 at 17:25
- FYI: [The issue has been fixed](https://github.com/scipy/scipy/pull/6679) in the development version of scipy. – Warren Weckesser Oct 20 '16 at 20:44
- Thanks, this has been both instructive and helpful. – khiner Oct 26 '16 at 14:43
