I am trying to solve the following one-dimensional least absolute difference (LAD) optimization problem
and I am using bisection method to find the best beta (a scalar). I have the following code:
import numpy as np
x = np.random.randn(10)
y = np.sign(np.random.randn(10)) + np.random.randn(10)
def find_best(x, y):
best_obj = 1000000
for beta in np.linspace(-100,100,1000000):
if np.abs(beta*x - y).sum() < best_obj:
best_obj = np.abs(beta*x - y).sum()
best_beta = beta
return best_obj, best_beta
def solve_bisect(x, y):
it = 0
u = 100
l = -100
while True:
it +=1
if it > 40:
# maxIter reached
return obj, beta, subgrad
# select the mid-point
beta = (l + u)/2
# subgrad calculation. \partial |x*beta - y| = sign(x*beta - y) * x. np.abs(x * beta -y) > 1e-5 is to avoid numerical issue
subgrad = (np.sign(x * beta - y) * (np.abs(x * beta - y) > 1e-5) * x).sum()
obj = np.sum(np.abs(x * beta - y))
print 'obj = %f, subgrad = %f, current beta = %f' % (obj, subgrad, beta)
# bisect. check subgrad to decide which part of the space to cut out
if np.abs(subgrad) <1e-3:
return obj, beta, subgrad
elif subgrad > 0:
u = beta + 1e-3
else:
l = beta - 1e-3
brute_sol = find_best(x,y)
bisect_sol = solve_bisect(x,y)
print 'brute_sol: obj = %f, beta = %f' % (brute_sol[0], brute_sol[1])
print 'bisect_sol: obj = %f, beta = %f, subgrad = %f' % (bisect_sol[0], bisect_sol[1], bisect_sol[2])
As you can see I also have a brute-force implementation that searches over the space to get the oracle answer (up to some numerical error). Every run can find the optimal best and minumum objective value. However, the subgrad is not 0 (not even close). For instance, one of my run I got the following:
brute_sol: obj = 10.974381, beta = -0.440700
bisect_sol: obj = 10.974374, beta = -0.440709, subgrad = 0.840753
Objective value and best are correct, but subgrad is not close to 0 at all. So the question:
- why the subgrad is not close to 0 ? Isn't the optimal condition being 0 is in the subdifferential if and only if it's optimal?
- What stopping criterion should we use instead?