
I'm trying to run some curve fitting using SciPy's optimize module. I don't use polyfit directly because I want to ensure the fitted curve is monotonic, given the relationship I have. So suppose I have the following relationship between sulphur and temperature:

sulphur     = np.array([  71.,   82.,   50.,  113.,  153.,  177.,  394., 1239., 2070., 2662., 3516., 4000., 4954., 6314.])
temperature = np.array([  70.,   90.,  140.,  165.,  210.,  235.,  265.,  330.,  350.,  390.,  410.,  435.,  540.,  580.])

And I want to find a curve to fit this relationship which is monotonically increasing.

I have found bits of code around, and this is what I have: I build the polynomial and use it in the objective function, and I constrain the first derivative to be positive at every data point. I also use the polyfit coefficients as x0 in order to speed things up:

import numpy as np
from scipy import optimize

x = sulphur
y = temperature
initial = list(reversed(np.polyfit(x, y, 3)))
Nfeval = 1

def polynomial(p, x):
    return p[0]+p[1]*x+p[2]*x**2+p[3]*x**3

def constraint_1st_der(p, x):
    return p[1]+2*p[2]*x+3*p[3]*x**2

def objective(p, x):
    return ((polynomial(p, x) - y)**2).sum()

def f(p):
    return objective(p, x)

def callback(p):
    global Nfeval
    print(Nfeval, p, constraint_1st_der(p, x))
    Nfeval += 1

cons = {'type' : 'ineq', 'fun' : lambda p : constraint_1st_der(p, x)}

res = optimize.minimize(f, x0=np.array(initial), method='SLSQP', constraints=cons, callback = callback)

However, optimize always returns:

     fun: 4.0156824919527855e+23
     jac: array([0.00000000e+00, 0.00000000e+00, 7.02561542e+17, 3.62183986e+20])
 message: 'Inequality constraints incompatible'
    nfev: 6
     nit: 1
    njev: 1
  status: 4
 success: False
       x: array([ -111.35802358,  1508.06894349, -2969.11149743,  2223.26354865])

I tried normalizing (e.g. sul_norm = sul / max(sul)), and with that the optimization succeeds, but I'd like to avoid it (I have to invert the function at one point, and converting back to the original scale can get messy). Besides, the relationship seems pretty basic, and the temperature and sulphur values are not on such extremely different scales. What could it be? Thanks!
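(For context, the back-conversion I mean is just a coefficient rescaling: if the cubic is fitted on x/s, each coefficient p_k maps back to the original scale as p_k / s**k. A minimal sketch, using numpy's polynomial helpers rather than the optimizer:)

```python
import numpy as np

sul = np.array([71., 82., 50., 113., 153., 177., 394., 1239.,
                2070., 2662., 3516., 4000., 4954., 6314.])
temperature = np.array([70., 90., 140., 165., 210., 235., 265., 330.,
                        350., 390., 410., 435., 540., 580.])

s = sul.max()
# fit a cubic on the normalized variable u = x / s
# (coefficients returned low-to-high degree)
p_norm = np.polynomial.polynomial.polyfit(sul / s, temperature, 3)
# map coefficients back to the original scale: p_k -> p_k / s**k
p_orig = p_norm / s ** np.arange(4)

# both parameterizations evaluate identically
xq = 1000.0
print(np.polynomial.polynomial.polyval(xq / s, p_norm))
print(np.polynomial.polynomial.polyval(xq, p_orig))
```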

giga
    Please provide a reproducible example. In `(polynomial(p, x)-t)**2`, `t` is not defined. – SuperKogito Jul 08 '19 at 13:01
  • Sorry, it's obviously (polynomial(p, x)-y)**2, I forgot to change that label! Thanks. – giga Jul 08 '19 at 13:46
  • 2
    Similar to https://stackoverflow.com/questions/33511284/scipy-optimize-minimize-method-slsqp-ignores-constraint perhaps ? – Ben Jul 09 '19 at 14:35

2 Answers


There are several limiting issues here. First, the solver choice, which strongly affects what kind of optimization you can do (constrained, bounded, etc.). On top of that, in your case you are interested in the parameters and have a predefined set (x, y) at your disposal, so you should process your data in a multi-dimensional fashion for computations involving (x, y). However, the constraint definition you used is, to my knowledge, suited to one-dimensional outputs. Therefore, as this SO question suggests, using the gradient is a good idea.

Unfortunately, when testing that on your case, the results were not convincing to me, despite error-free code runs. After tweaking your code a bit, I managed to find a decent workaround, though I am not sure it is the best one out there. My idea is to use the Nelder-Mead solver and, via an equality constraint, make sure your vector of derivatives is all positive. The code is the following:

import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt

np.set_printoptions(precision=3)
# init data
sulphur     = np.array([ 71.,  82.,  50., 113., 153., 177., 394., 1239., 2070., 2662., 3516., 4000., 4954., 6314.])
temperature = np.array([ 70.,  90., 140., 165., 210., 235., 265.,  330.,  350.,  390.,  410.,  435.,  540.,  580.])

# init x and y
x = sulphur
y = temperature

# compute initial guess
initial = list(reversed(np.polyfit(x, y, 3)))
Nfeval  = 1

# define functions
polynomial = lambda p, x: p[0] + p[1]*x +   p[2]*x**2 +   p[3]*x**3
derivative = lambda p, x:        p[1]   + 2*p[2]*x    + 3*p[3]*x**2 

def constraint(p):
    if (derivative(p, x) > 0).all() : return 0
    else                            : return -1

def callback(p):
    global Nfeval
    print("Evaluations nbr: %3s | p: %5s |cons: %3s" % (Nfeval,
                                                        p, 
                                                        constraint(p)))
    Nfeval += 1

func = lambda p: np.linalg.norm(y - polynomial(p, x))
cons = {'type' : 'eq', 'fun' : constraint}
res  = optimize.minimize(func, 
                         x0          = np.array(initial), 
                         method      ='Nelder-Mead', 
                         constraints = cons,
                         callback    = callback)
print('----------------------------------------------------------------------------------------')
print(res)

# plot results
f   = plt.figure(figsize=(10,4))
ax1 = f.add_subplot(131)
ax2 = f.add_subplot(132)
ax3 = f.add_subplot(133)

ax1.plot(x, y,'ro', label = 'data')
ax1.plot(x, polynomial(res.x,   x), label = 'fit using optimization', color="orange")
ax1.legend(loc=0) 
ax2.plot(x, derivative(res.x, x), label ='derivative')
ax2.legend(loc=0)
ax3.plot(x, y,'ro', label = 'data')
ax3.plot(x, polynomial(initial, x), label = 'fit using polyfit', color="green")
ax3.legend(loc=0)
plt.show()

Output:

...
Evaluations nbr:  95 | p: [ 1.400e+02  1.830e-01 -4.203e-05  3.882e-09] |cons:   0
Evaluations nbr:  96 | p: [ 1.400e+02  1.830e-01 -4.203e-05  3.882e-09] |cons:   0
Evaluations nbr:  97 | p: [ 1.400e+02  1.830e-01 -4.203e-05  3.882e-09] |cons:   0
Evaluations nbr:  98 | p: [ 1.400e+02  1.830e-01 -4.203e-05  3.882e-09] |cons:   0
----------------------------------------------------------------------------------------
 final_simplex: (array([[ 1.400e+02,  1.830e-01, -4.203e-05,  3.882e-09],
       [ 1.400e+02,  1.830e-01, -4.203e-05,  3.882e-09],
       [ 1.400e+02,  1.830e-01, -4.203e-05,  3.882e-09],
       [ 1.400e+02,  1.830e-01, -4.203e-05,  3.882e-09],
       [ 1.400e+02,  1.830e-01, -4.203e-05,  3.881e-09]]), array([159.565, 159.565, 159.565, 159.565, 159.565]))
           fun: 159.5654373399882
       message: 'Optimization terminated successfully.'
          nfev: 168
           nit: 99
        status: 0
       success: True
             x: array([ 1.400e+02,  1.830e-01, -4.203e-05,  3.882e-09])

Plots:

(plots: data with the optimization fit, the derivative, and the polyfit fit)

SuperKogito
  • Thank you SuperKogito, I will give it a try straight away. I focused so much on why SLSQP didn't work, that I didn't check on other optimizers. – giga Jul 09 '19 at 07:57
  • Hello SuperKogito, sorry for the late reply, I was extensively testing this. I encountered a set of values where the constraint stays at -1, but the optimizer still says it optimized successfully: sulphur = np.array([ 0.13, 0.24, 0.52, 0.61, 0.72, 0.77, 0.81, 0.88, 0.89, 0.92, 0.92, 0.93, 0.95, 0.96]) temperature = np.array([ 70., 90., 140., 165., 210., 235., 265., 330., 350., 390., 410., 435., 540., 580.]) – giga Jul 17 '19 at 15:57
  • It has been a while since I looked into this. Please provide an example with the case you are talking about. Last time, I recall coming across a case with some weird output. Therefore my recommendation is to use [scipy.optimize.basinhopping](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.basinhopping.html#scipy.optimize.basinhopping) as it is more general and the chances of getting stuck with a local minimum is lower imo. – SuperKogito Jul 17 '19 at 16:17
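(Following up on the basinhopping suggestion above, a minimal sketch of how it could be wired up, with SLSQP as the local minimizer and the same derivative constraint; the normalized data is used here, since the OP reports SLSQP works on that scale. This is an illustration, not a verified fix for the problematic data set:)

```python
import numpy as np
from scipy import optimize

sulphur     = np.array([ 71.,  82.,  50., 113., 153., 177., 394., 1239.,
                        2070., 2662., 3516., 4000., 4954., 6314.])
temperature = np.array([ 70.,  90., 140., 165., 210., 235., 265.,  330.,
                         350.,  390.,  410.,  435.,  540.,  580.])

x = sulphur / sulphur.max()   # normalized, as discussed in the question
y = temperature

polynomial = lambda p, x: p[0] + p[1]*x + p[2]*x**2 + p[3]*x**3
derivative = lambda p, x: p[1] + 2*p[2]*x + 3*p[3]*x**2

# basinhopping repeatedly restarts a constrained local SLSQP solve
# from randomly perturbed starting points and keeps the best result
minimizer_kwargs = {
    'method': 'SLSQP',
    'constraints': {'type': 'ineq', 'fun': lambda p: derivative(p, x)},
}
res = optimize.basinhopping(
    lambda p: ((polynomial(p, x) - y) ** 2).sum(),
    x0=list(reversed(np.polyfit(x, y, 3))),
    minimizer_kwargs=minimizer_kwargs,
    niter=25,
    seed=0)
print(res.x)
```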

The problem lies in the bounds. Some numbers have no exact binary floating-point representation. Internally, the optimizer compares something like 99.9999999 - 100 > 0 and decides the values are not equal (bound not satisfied) if your constraint was X - Y == 0.0. After reaching the maximum number of evaluations, it concludes that no combination satisfies the constraint. To avoid this, and if it is not a problem for your solution, change the bounds to (0, 0.000001) instead of (0., 0.).
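(The exact-comparison pitfall described above is easy to demonstrate; a minimal sketch:)

```python
# 0.1, 0.2 and 0.3 have no exact binary representation,
# so an exact equality test on their sum fails
a = 0.1 + 0.2
print(a == 0.3)              # False: a is actually 0.30000000000000004
print(abs(a - 0.3) < 1e-9)   # True: comparing within a small tolerance,
                             # analogous to widening a (0., 0.) bound to (0., 1e-6)
```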