
I was trying to run a Least Absolute Deviations regression (L1 regression). I did it by formulating a linear program and solving it with CVXOPT (in Python). The number of features (including the constant) was 13 and the number of samples was ~100K. It took many hours to run to completion. Then I realized that I can run statsmodels' QuantReg with q=0.5, and it took a few seconds. I read that it uses iteratively reweighted least squares (IRLS). Is IRLS inherently faster than linear programming for this purpose, or would a different formulation of the linear program, together with a better solver like Gurobi, reduce the running time substantially? The reason I am asking is that I want to try a loss function that involves a summation of two different absolute-value functions with different weights. I believe I could employ IRLS for that problem as well, but I am not sure, and I am more comfortable with linear programming.
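For concreteness, I believe the two-term loss would still linearize as an LP with two sets of slack variables (the weights w1, w2 and the second target z below are my own notation, just to illustrate what I have in mind):

```
minimize    w1 * sum_i u_i  +  w2 * sum_i v_i
subject to  -u_i <= y_i - x_i'a <= u_i      (so u_i >= |y_i - x_i'a|)
            -v_i <= z_i - x_i'a <= v_i      (so v_i >= |z_i - x_i'a|)
```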

The formulation I used for the linear program was as follows:

```
minimize    sum_i v_i
subject to  -v_i + x_i'a <=  y_i      (i.e.  x_i'a - y_i <= v_i)
            -v_i - x_i'a <= -y_i      (i.e.  y_i - x_i'a <= v_i)
```

so that at the optimum each slack v_i equals |y_i - x_i'a|.

The resulting code for CVXOPT was as follows:

import numpy as np
from cvxopt import matrix, spmatrix
from cvxopt.solvers import lp
from scipy.sparse import hstack, identity, vstack


def scipy_sparse_to_spmatrix(a):
    '''Convert a scipy sparse matrix to a CVXOPT spmatrix.'''
    coo = a.tocoo()
    return spmatrix(coo.data, coo.row.tolist(), coo.col.tolist(), size=a.shape)


def l1_fit(x, y):
    '''
    :param x: feature matrix, shape (n, k)
    :param y: label vector, shape (n,)
    :return: LAD coefficients

    The problem I construct is:
        min c'*(v|A)
        subject to:
            (-I|x)*(v|A) <= y
            (-I|-x)*(v|A) <= -y
    '''
    n, k = y.shape[0], x.shape[1]
    # objective: minimize the sum of the slacks v; the coefficients A are free
    c = matrix(np.concatenate((np.ones(n), np.zeros(k))).astype(float).tolist(),
               (n + k, 1), 'd')
    # inequality constraints, built sparse to save memory
    G = scipy_sparse_to_spmatrix(
        np.transpose(hstack((vstack((-identity(n), np.transpose(x))),
                             vstack((-identity(n), np.transpose(-x))))).astype(float)))
    h = matrix(np.concatenate((y, -y)).astype(float).tolist(), (2 * n, 1), 'd')
    sol = lp(c, G, h)
    return np.array(sol['x'][n:]).ravel()  # drop the slacks, keep the coefficients
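One of the alternative LP formulations mentioned in the linked discussions splits each residual into nonnegative parts instead of duplicating x in the constraint matrix, which roughly halves the constraint block. A minimal sketch with `scipy.optimize.linprog` and its HiGHS backend (`l1_fit_linprog` is my own name, not a library function):

```python
import numpy as np
from scipy.optimize import linprog


def l1_fit_linprog(x, y):
    """LAD fit via the split-residual LP:
        min  sum(rp + rn)
        s.t. x @ b + rp - rn = y,  rp >= 0, rn >= 0,  b free
    At the optimum rp + rn = |y - x @ b|, and x appears only once.
    """
    n, k = x.shape
    # variable vector: [b (k entries, free), rp (n, >= 0), rn (n, >= 0)]
    c = np.concatenate([np.zeros(k), np.ones(2 * n)])
    # for very large n, build A_eq with scipy.sparse instead of np.hstack
    A_eq = np.hstack([x, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]
```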
Cena
alonrieger
  • There are different LP formulations for the L1 regression problem. For large problems, it may be beneficial not to duplicate the X matrix. See [link](http://yetanothermathprogrammingconsultant.blogspot.com/2017/11/lp-and-lad-regression.html) – Erwin Kalvelagen Oct 19 '20 at 14:51
  • Koenker used an interior point algorithm. Developers for quantile regression in Julia reported a speedup compared to the IRLS version; the latter was originally a translation of the statsmodels version. https://github.com/pkofod/QuantileRegressions.jl – Josef Oct 20 '20 at 01:41
  • The statsmodels version seems to do pretty well to get close to the optimum, but sometimes has problems settling down to the final result. – Josef Oct 20 '20 at 01:45
  • +1 for statsmodels QuantReg. Yes, LP solver times can vary a LOT, see this [LP comparison matrix](https://scicomp.stackexchange.com/questions/34049/should-benchmarkings-be-done-at-all-what-is-the-point/36978#36978) of two dozen test cases on scicomp.stack. – denis May 07 '21 at 08:18
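For comparison with the comments above, the IRLS idea behind the statsmodels approach can be sketched as follows. This is a bare-bones illustration, not the statsmodels code; `eps` guards against division by near-zero residuals, and a fixed iteration count stands in for a proper convergence check:

```python
import numpy as np


def l1_fit_irls(x, y, n_iter=50, eps=1e-6):
    """IRLS sketch for LAD: each step solves a weighted least-squares
    problem with weights 1 / max(|residual|, eps), so observations with
    small residuals are downweighted less and the fit approaches the
    L1 solution."""
    b = np.linalg.lstsq(x, y, rcond=None)[0]  # OLS starting point
    for _ in range(n_iter):
        r = y - x @ b
        w = 1.0 / np.maximum(np.abs(r), eps)
        # weighted normal equations: (X' W X) b = X' W y
        xw = x * w[:, None]
        b = np.linalg.solve(xw.T @ x, xw.T @ y)
    return b
```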

0 Answers