I'm trying to fit a sigmoid curve to a small set of points, essentially generating a probability curve from a set of observations. I'm using scipy.optimize.curve_fit with a slightly modified logistic function (so that it is bounded completely within [0,1]). So far I have had the greatest success with the dogbox method and an exact tr_solver.
When I attempt to run the code, for certain data points it will raise:
ValueError: `x0` violates bound constraints.
I did not run into this issue (using the same code and data) until I updated to the most recent versions of numpy and scipy (numpy 1.17.0, scipy 1.3.1), so I believe it is a result of that update. I cannot downgrade, because other libraries that I need for other parts of this project require these versions.
I'm running this on a large dataset (N ~15000), and for very specific values the curve fit fails, claiming that the initial guess is outside of the bound constraints. This is not the case, and even checking quickly via the print statement before the curve fit in the provided example confirms this.
At first I thought it was a numpy precision error, and that a value as small as the second parameter in the example below (~4.5e-19) was being treated as out of bounds. However, altering it slightly, or providing a new, arbitrary number of a similar magnitude, does not cause a ValueError. Additionally, other failing values are as large as ~1e-10, so I assume it must be something else.
Here is an example that fails for me every time:
import numpy as np
import scipy as sp
from scipy.special import expit, logit
import scipy.optimize
def f(x,x0,g,c,k):
    y = c*expit(k*10.*(x-x0)) + g*(1.-c)
    return y
# parameter order: x0, g, c, k
p0 = np.array([8.841357069490852e-01, 4.492363462957287e-19, 5.547073496706608e-01, 7.435378446218519e+00])
bounds = np.array([[-1.,1.], [0.,1.], [0.,1.], [0.,20.]])
x = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.8911796599834791, 1.0, 1.0, 1.0, 0.33232919909076103, 1.0])
y = np.array([0.999, 0.999, 0.999, 0.999, 0.999, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001])
s = np.array([0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9])
print([pval >= b[0] and pval <= b[1] for pval,b in zip(p0,bounds)])
fit,cov = sp.optimize.curve_fit(f,x,y,p0=p0,sigma=s,bounds=([b[0] for b in bounds],[b[1] for b in bounds]),method='dogbox',tr_solver='exact')
print(fit)
print(cov)
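The print statement in the example only reports True or False per parameter. A slightly more detailed check, sketched here with the same p0 and bounds, shows how far each starting value sits from its bounds. The names lower, upper, dist_lower and dist_upper are introduced only for this illustration.

```python
import numpy as np

# Starting values and bounds copied from the example (order: x0, g, c, k).
p0 = np.array([8.841357069490852e-01, 4.492363462957287e-19,
               5.547073496706608e-01, 7.435378446218519e+00])
lower = np.array([-1.0, 0.0, 0.0, 0.0])
upper = np.array([1.0, 1.0, 1.0, 20.0])

# Signed distance to each bound; a negative entry would mean the
# starting value really is outside the bounds.
dist_lower = p0 - lower
dist_upper = upper - p0

for name, lo, hi in zip(["x0", "g", "c", "k"], dist_lower, dist_upper):
    print(f"{name}: {lo:.3e} above lower bound, {hi:.3e} below upper bound")

# g is smaller than double-precision epsilon, so 1.0 - g rounds to
# exactly 1.0 even though g itself is strictly positive.
print(dist_upper[1] == 1.0)

inside = (dist_lower >= 0) & (dist_upper >= 0)
print(inside.all())
```

This only confirms that the initial guess is inside the bounds; it says nothing about points the optimizer visits later.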
Here is the traceback, showing everything after the call to curve_fit above:
File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\minpack.py", line 763, in curve_fit
**kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\_lsq\least_squares.py", line 927, in least_squares
tr_solver, tr_options, verbose)
File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\_lsq\dogbox.py", line 310, in dogbox
J = jac(x, f)
File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\_lsq\least_squares.py", line 874, in jac_wrapped
kwargs=kwargs, sparsity=jac_sparsity)
File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\site-packages\scipy\optimize\_numdiff.py", line 362, in approx_derivative
raise ValueError("`x0` violates bound constraints.")
ValueError: `x0` violates bound constraints.
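The traceback shows the error being raised from inside dogbox.py rather than from the initial checks, so the `x0` in the message may refer to a parameter vector the optimizer reached during iteration and not to the p0 passed in. That is an assumption, not a confirmed diagnosis. One way to look into it is to record every parameter vector handed to the model function and compare it to the bounds afterwards. The wrapper f_logged and the list evaluated below are hypothetical helpers for this sketch.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit

def f(x, x0, g, c, k):
    return c * expit(k * 10.0 * (x - x0)) + g * (1.0 - c)

# Same data as in the example (order of p0: x0, g, c, k).
p0 = np.array([8.841357069490852e-01, 4.492363462957287e-19,
               5.547073496706608e-01, 7.435378446218519e+00])
lower = np.array([-1.0, 0.0, 0.0, 0.0])
upper = np.array([1.0, 1.0, 1.0, 20.0])
x = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.8911796599834791,
              1.0, 1.0, 1.0, 0.33232919909076103, 1.0])
y = np.array([0.999] * 5 + [0.001] * 6)
s = np.full(x.shape, 0.9)

evaluated = []  # every parameter vector passed to the model

def f_logged(x, x0, g, c, k):
    evaluated.append(np.array([x0, g, c, k]))
    return f(x, x0, g, c, k)

try:
    fit, cov = curve_fit(f_logged, x, y, p0=p0, sigma=s,
                         bounds=(lower, upper),
                         method='dogbox', tr_solver='exact')
    print("fit succeeded:", fit)
except (ValueError, RuntimeError) as exc:
    print("fit failed:", exc)

params = np.array(evaluated)
# Positive entries mark how far a logged vector was outside a bound.
worst = np.maximum(lower - params, params - upper).max(axis=0)
print("model evaluations:", len(params))
print("largest bound violation per parameter:", np.maximum(worst, 0.0))
```

If any entry of the last line is greater than zero, some evaluated point was outside the bounds even though p0 was not. Whether the call fails at all depends on the installed scipy version.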
If anyone has any insight as to what may be causing this, I would greatly appreciate the help! I did some searching and couldn't find any answers that may relate to this scenario, so I decided to open this question up. Thanks!
EDIT 9/9/19:
np.__version__ is 1.17.2 and sp.__version__ is 1.3.1. When I originally posted this I was on numpy 1.17.0, but upgrading has not fixed the issue. I'm running this on Python 3.6.6 on 64-bit Windows 10.
If I change either the second or fourth bound to +/-np.inf (or change both), the code does in fact complete, but I am still unsure how my x0 is invalid (and I still need the fit bounded to these values).
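Since the fit completes once bounds are removed, one possible workaround is to keep the parameters bounded through a change of variables instead of the bounds argument. This is a different technique from what the code above does: each parameter is written as a logistic function of an unconstrained value, so it cannot leave its interval. The helpers to_bounded, to_unbounded and f_unbounded are hypothetical names for this sketch, and it has not been checked that the resulting fit converges on the data above.

```python
import numpy as np
from scipy.special import expit, logit

def f(x, x0, g, c, k):
    return c * expit(k * 10.0 * (x - x0)) + g * (1.0 - c)

# Bounds from the example (order: x0, g, c, k).
lower = np.array([-1.0, 0.0, 0.0, 0.0])
upper = np.array([1.0, 1.0, 1.0, 20.0])

def to_bounded(u):
    """Map unconstrained values onto [lower, upper], elementwise."""
    return lower + (upper - lower) * expit(np.asarray(u, dtype=float))

def to_unbounded(p):
    """Inverse of to_bounded; p must lie strictly inside the bounds."""
    return logit((np.asarray(p, dtype=float) - lower) / (upper - lower))

def f_unbounded(x, u0, u1, u2, u3):
    """Same model as f, parameterized by unconstrained values."""
    return f(x, *to_bounded([u0, u1, u2, u3]))

p0 = np.array([8.841357069490852e-01, 4.492363462957287e-19,
               5.547073496706608e-01, 7.435378446218519e+00])

u_start = to_unbounded(p0)   # starting point in the transformed space
print(u_start)
print(to_bounded(u_start))   # round trip back to the original scale
```

f_unbounded and u_start could then be passed to curve_fit without the bounds argument, and the fitted values mapped back with to_bounded. The covariance returned by such a fit would be in terms of the transformed parameters, not the original ones.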
EDIT 1/22/20:
I upgraded np.__version__ to 1.18.1 and sp.__version__ to 1.4.1, to no avail. I have opened an issue on the scipy GitHub repository for this error. However, it seems that they are unable to reproduce the issue and therefore cannot address it.