As mentioned in the comments: neither the docs nor the sources indicate that this is supported within sklearn!
I just tried the alternative of using off-the-shelf convex-optimization solvers. It's a simple prototype-like approach and it might not be a good fit for your (incompletely defined) task (sample size?).
Some comments:
- the implementation / model formulation is easy
- the problem is harder to solve than I thought
- the solver ECOS has general trouble with it
- the solver SCS reaches good accuracy (though worse than sklearn's)
- but: tuning the iterations to improve accuracy breaks the solver (the problem then becomes infeasible for SCS!)
- SCS + a big-M-based formulation (the sum constraint is additionally posted as a penalization term within the objective) looks usable, but might need tuning
- only open-source solvers were tested; commercial ones might be much better
Further things to try:
- for tackling huge problems (where performance becomes more important than robustness and accuracy), an (accelerated) projected stochastic gradient approach looks promising (a rough sketch is given at the very end of this answer)
Code
""" data """
from time import perf_counter as pc
import numpy as np
from sklearn import datasets
diabetes = datasets.load_diabetes()
A = diabetes.data
y = diabetes.target
alpha=0.1
print('Problem-size: ', A.shape)
def obj(x):  # following sklearn's Lasso definition from the user guide:
    # (1 / (2 * n_samples)) * ||y - Ax||_2^2 + alpha * ||x||_1
    return (1. / (2*A.shape[0])) * np.square(np.linalg.norm(A.dot(x) - y, 2)) + alpha * np.linalg.norm(x, 1)
""" sklearn """
print('\nsklearn classic l1')
from sklearn import linear_model
clf = linear_model.Lasso(alpha=alpha, fit_intercept=False)
t0 = pc()
clf.fit(A, y)
print('used (secs): ', pc() - t0)
print(obj(clf.coef_))
print('sum x: ', np.sum(clf.coef_))
""" cvxpy """
print('\ncvxpy + scs classic l1')
from cvxpy import *
x = Variable(A.shape[1])
objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y) + alpha * norm(x, 1))
problem = Problem(objective, [])
t0 = pc()
problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)
print('used (secs): ', pc() - t0)
print(obj(x.value.flat))
print('sum x: ', np.sum(x.value.flat))
""" cvxpy -> sum x == 1 """
print('\ncvxpy + scs sum == 1 / 1st approach')
objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y))
constraints = [sum(x) == 1]
problem = Problem(objective, constraints)
t0 = pc()
problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)
print('used (secs): ', pc() - t0)
print(obj(x.value.flat))
print('sum x: ', np.sum(x.value.flat))
""" cvxpy approach 2 -> sum x == 1 """
print('\ncvxpy + scs sum == 1 / 2nd approach')
M = 1e6  # big-M weight
# keep the hard constraint, but additionally put a big-M term on (sum(x) - 1) into the objective
objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y) + M*(sum(x) - 1))
constraints = [sum(x) == 1]
problem = Problem(objective, constraints)
t0 = pc()
problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)
print('used (secs): ', pc() - t0)
print(obj(x.value.flat))
print('sum x: ', np.sum(x.value.flat))
Output
Problem-size: (442, 10)
sklearn classic l1
used (secs): 0.001451024380348898
13201.3508496
sum x: 891.78869298
cvxpy + scs classic l1
used (secs): 0.011165673357417458
13203.6549995
sum x: 872.520510561
cvxpy + scs sum == 1 / 1st approach
used (secs): 0.15350853891775978
13400.1272148
sum x: -8.43795102327
cvxpy + scs sum == 1 / 2nd approach
used (secs): 0.012579569383536493
13397.2932976
sum x: 1.01207061047
Edit
Just for fun I implemented a slow, non-optimized prototype solver using the approach of accelerated projected gradient (remarks in the code!).
This one should scale much better to huge problems (as it's a first-order method), despite its slow behaviour here (because it's not optimized). There should be a lot of potential!
Warning: might be seen as advanced numerical optimization to some people :-)
Edit 2
I forgot to add the nonnegativity constraint to the projection (sum(x) == 1 does not make much sense if x can be negative!). This makes the solving much harder (numerical trouble), and it's obvious that one of those fast special-purpose projections should be used (I'm too lazy right now; I think O(n log n) algorithms are available). Again: this APG solver is a prototype, not ready for real tasks.
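For reference, the usual sorting-based O(n log n) projection onto the probability simplex would look roughly like this (an untested sketch, not wired into the solver below; project_simplex is a name I made up):

def project_simplex(v, z=1.0):
    """ Euclidean projection of v onto {x : x >= 0, sum(x) == z};
        standard sorting-based O(n log n) approach (see e.g. Duchi et al., 2008) """
    u = np.sort(v)[::-1]                           # sort descending
    css = np.cumsum(u)
    ind = np.arange(1, v.shape[0] + 1)
    rho = np.nonzero(u * ind > (css - z))[0][-1]   # last index fulfilling the support condition
    theta = (css[rho] - z) / ind[rho]
    return np.maximum(v - theta, 0.0)

Inside solve_pg below, the cvxpy-based projection helper could then be replaced by a single call to project_simplex, which would also add the missing nonnegativity.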
Code
""" accelerated pg -> sum x == 1 """
def solve_pg(A, b, momentum=0.9, maxiter=1000):
    """ remarks:
            algorithm: accelerated projected gradient (constant momentum)
            projection: proj onto {x : sum(x) == 1}
                        (nonnegativity is missing, see edit 2!)
                -> naive and slow using cvxpy + ecos
            line-search: armijo-rule on the (unprojected) gradient step
                (a projection-arc search as in Bertsekas' book would be cleaner,
                 but suffers from the slow projection)
            stopping-criterion: naive
            gradient-calculation: precomputes AtA
                -> not needed and not recommended for huge sparse data!
    """
    M, N = A.shape
    x = np.zeros(N)         # current (projected) iterate
    x_prev = x.copy()       # previous iterate, needed for the momentum step
    AtA = A.T.dot(A)
    Atb = A.T.dot(b)
    stop_count = 0
    # projection helper (tiny QP solved by cvxpy)
    x_ = Variable(N)
    v_ = Parameter(N)
    objective_ = Minimize(0.5 * square(norm(x_ - v_, 2)))
    constraints_ = [sum(x_) == 1]
    problem_ = Problem(objective_, constraints_)
    def gradient(x):
        return AtA.dot(x) - Atb
    def obj(x):
        return 0.5 * np.linalg.norm(A.dot(x) - b)**2
    it = 0
    while True:
        # momentum / extrapolation step: the gradient is evaluated at the
        # extrapolated point (this is what makes the method "accelerated")
        y_ex = x + momentum * (x - x_prev)
        grad = gradient(y_ex)
        # line search
        alpha = 1
        beta = 0.5
        sigma = 1e-2
        old_obj = obj(y_ex)
        while True:
            new_y = y_ex - alpha * grad
            new_obj = obj(new_y)
            if old_obj - new_obj >= sigma * grad.dot(y_ex - new_y):
                break
            else:
                alpha *= beta
        x_prev = x.copy()
        # gradient step + projection back onto the feasible set
        v_.value = y_ex - alpha * grad
        problem_.solve()
        x = np.array(x_.value.flat)
        # naive stopping-criterion on the change of the objective
        if np.abs(old_obj - obj(x)) < 1e-2:
            stop_count += 1
        else:
            stop_count = 0
        if stop_count == 3:
            print('early-stopping @ it: ', it)
            return x
        it += 1
        if it == maxiter:
            return x
print('\n acc pg')
t0 = pc()
x = solve_pg(A, y)
print('used (secs): ', pc() - t0)
print(obj(x))
print('sum x: ', np.sum(x))
Output
acc pg
early-stopping @ it: 367
used (secs): 0.7714511330487027
13396.8642379
sum x: 1.00000000002
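To sketch the stochastic variant mentioned at the top (for huge problems, where even precomputing AtA as above is out of the question), here is a rough, untested idea of one epoch of mini-batch projected gradient. It reuses the project_simplex helper sketched above; psg_epoch, the batch size, the step size and the number of epochs are all made-up values and would need tuning:

""" mini-batch projected (stochastic) gradient -> sum x == 1, x >= 0 (sketch only) """
def psg_epoch(A, b, x, step=1e-2, batch_size=64, seed=0):
    """ one epoch of projected stochastic gradient (no momentum, no line-search,
        no stopping-criterion; the l1-term is constant on the simplex and ignored) """
    m = A.shape[0]
    order = np.random.RandomState(seed).permutation(m)
    for idx in np.array_split(order, max(m // batch_size, 1)):
        A_b, b_b = A[idx], b[idx]
        grad = A_b.T.dot(A_b.dot(x) - b_b) / idx.shape[0]  # mini-batch gradient of the ls-term
        x = project_simplex(x - step * grad)               # project back onto the simplex
    return x

x = np.ones(A.shape[1]) / A.shape[1]  # feasible start
for epoch in range(100):              # made-up number of epochs
    x = psg_epoch(A, y, x, seed=epoch)
print(obj(x))
print('sum x: ', np.sum(x))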