
I am using scipy optimize to get the minimum value on the following function:

def randomForest_b(a, b, c, d, e):
    return abs(rf_diff.predict([[a, b, c, d, e]]))

I eventually want to be able to get the optimal values of a and b given the arguments c, d, and e. However, just to learn how the optimize function works, I am trying to get the optimal value of a given the other arguments. I have the following code:

res=optimize.minimize(randomForest_b, x0=45,args=(119.908500,65.517527,2.766103,29.509200), bounds=((45,65),))
print(res) 

And I have even tried:

optimize.fmin_slsqp(randomForest_b, x0=45,args=(119.908500,65.517527,2.766103,29.509200), bounds=((45,65),))

However, both of these just return the x0 value.

Optimization terminated successfully.    (Exit mode 0)
        Current function value: 1.5458542752157667
        Iterations: 1
        Function evaluations: 3
        Gradient evaluations: 1
array([ 45.])

The reported function value is correct, but x0 = 45 is not the value of a that gives the minimum function value within the bounds. I have the bounds set because the variable a can only be a number between 45 and 65. Am I missing something or doing something wrong? And if possible, how can I get the optimal values of a and b?

Here is an example of the complete code I am using:

    import numpy as np
    import pandas as pd
    import scipy.optimize as optimize

    a=np.random.uniform(low=4.11, high=6.00, size=(50,))
    b=np.random.uniform(low=50.11, high=55.99, size=(50,))
    c=np.random.uniform(low=110.11, high=120.99, size=(50,))
    d=np.random.uniform(low=50.11, high=60.00, size=(50,))
    pv=np.random.uniform(low=50.11, high=60.00, size=(50,))

    df=pd.DataFrame(a, columns=['a'])
    df['b']=b
    df['c']=c
    df['d']=d
    df['pv']=pv
    df['difference']=df['pv']-df['d']

    from sklearn.model_selection import train_test_split 
    y=df.loc[:, 'difference']
    x=df.iloc[:, [0,1,2,3]]
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)

    from sklearn.ensemble import RandomForestRegressor
    rf_difference = RandomForestRegressor(n_estimators=1000, oob_score=True,
                                          random_state=0)
    rf_difference.fit(x_train, y_train) 

    def randomForest_b(a,b,c,d):
        return abs(rf_difference.predict([[a,b,c,d]]))
        
    res = optimize.minimize(randomForest_b, x0=0,
                            args=(51.714088, 110.253656, 54.582179),
                            bounds=((0, 6),))
    print(res)

    optimize.fmin_slsqp(randomForest_b, x0=0,
                        args=(51.714088, 110.253656, 54.582179),
                        bounds=((0, 6),))
– Dana McDowelle
    Is your objective function a smooth (specifically, differentiable) function of `a`? What does a plot of `randomForest_b(a,b,c,d,e)` look like for, say, `a = np.linspace(40, 70, 500)`? – Warren Weckesser Oct 10 '18 at 14:46
  • @WarrenWeckesser the objective function doesn't appear to be smooth, there are distinct angles in the plot – Dana McDowelle Oct 10 '18 at 15:17
  • Could you add a plot of your function as @WarrenWeckesser described or give us a possibility to execute your code? Without this, it is hard to do more than guessing about local minima and other properties of your function which could possibly cause this. – jdamp Oct 11 '18 at 06:54
  • @jdamp, yes I added a sample code in the post above. I modified it a bit, I just want to get the value of a between 0 and 6 that will give the minimum value of the randomForest_b function given the parameters. – Dana McDowelle Oct 12 '18 at 16:36

2 Answers


The function you are trying to minimize is not smooth and also has several plateaus, as can be seen by plotting randomForest_b as a function of a:

import numpy as np
import matplotlib.pyplot as plt

a = np.linspace(0, 6, 500)
args = 51.714088, 110.253656, 54.582179
vrandomForest_b = np.vectorize(randomForest_b, excluded=[1, 2, 3])
y_values = vrandomForest_b(a, *args)

fig, ax = plt.subplots(figsize=(8,6))
ax.plot(a, y_values, label='randomForest_b')
ax.axvline(0, label='Your start value', color='g', ls='--')
ax.set(xlabel='a', ylabel='randomForest_b');
ax.legend()

For non-smooth functions like yours, gradient-based optimization techniques will almost certainly fail. In this case, the starting value of 0 lies on a plateau where the gradient vanishes, so the optimization finishes immediately after one iteration.

A solution is to use a gradient-free optimization method, for example stochastic minimization with scipy.optimize.differential_evolution. A caveat of these methods is that they usually require more function evaluations and can take longer to finish.

This optimization method is able to find the global minimum in the example case given in your question:

rslt = optimize.differential_evolution(vrandomForest_b,
                                       args=(51.714088,110.253656,54.582179), 
                                       bounds=[(0,6)])
print(rslt)

fig, ax = plt.subplots()
ax.plot(a, y_values, label='randomForest_b')
ax.axvline(rslt.x, label='Minimum', color='red', ls='--')
ax.legend()

     fun: 0.054257768073620746
 message: 'Optimization terminated successfully.'
    nfev: 152
     nit: 9
 success: True
       x: array([5.84335956])
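
The same approach also covers the part of the question about finding the best a and b together. A minimal sketch, assuming the rf_difference model from the question, with c and d held at the example values; the bounds for b are an assumption that simply mirrors the range the sample data was drawn from:

# Sketch: optimize a and b jointly while c and d stay fixed.
# Assumes rf_difference from the question; the bounds for b are an
# assumption based on the range used to generate the sample data.
def randomForest_ab(x, c, d):
    a, b = x
    return abs(rf_difference.predict([[a, b, c, d]]))[0]

rslt_ab = optimize.differential_evolution(randomForest_ab,
                                          args=(110.253656, 54.582179),
                                          bounds=[(0, 6), (50.11, 55.99)])
print(rslt_ab.x)  # best (a, b) found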
– jdamp

Different algorithms (the method parameter) of scipy.optimize.minimize have different expectations about the objective function and the other parameters you pass.

The docs at https://docs.scipy.org/doc/scipy/tutorial/optimize.html#unconstrained-minimization-of-multivariate-scalar-functions-minimize helped clarify this.

I had a problem similar to yours, and adding method='nelder-mead' allowed the optimizer to work. With other methods (which assume a differentiable function) it may be necessary to provide jac or hess functions in addition to the cost function being minimized; otherwise you get the behavior described in this question (e.g. the algorithm runs for one iteration and exits).

If method is unspecified, minimize defaults to a gradient-based method (BFGS, L-BFGS-B, or SLSQP, depending on whether bounds or constraints are present), so for a non-differentiable problem a derivative-free method such as Nelder-Mead may have to be passed explicitly in the call to minimize().
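
A minimal sketch of that call, assuming the rf_difference model and the example values from the question; Nelder-Mead needs a scalar return value, so the prediction is unwrapped with [0], and passing bounds to Nelder-Mead requires SciPy >= 1.7:

# Sketch: the question's problem with a derivative-free method.
# Assumes rf_difference from the question; [0] unwraps the prediction
# so the objective returns a scalar, as Nelder-Mead expects.
def objective(x, b, c, d):
    return abs(rf_difference.predict([[x[0], b, c, d]]))[0]

res = optimize.minimize(objective,
                        x0=3.0,  # start away from the plateau at a = 0
                        args=(51.714088, 110.253656, 54.582179),
                        method='Nelder-Mead',
                        bounds=((0, 6),))  # bounds here need SciPy >= 1.7
print(res.x)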

– BobW