Some remarks:
Intro
Question: Reproducible code
- The question setup is lazy: you could have included the imports and CSV-reading code instead of making us replicate it...
- Because everyone has to write their own parsing, it's now easy to end up experimenting on slightly different data...
Task + Approach
- This looks like a combination of:
- a general-purpose math-opt solver
- machine-learning-scale data
- = a combination that often hits limits!
Solvers: Expectations
- SCS is a first-order method and is expected to be less robust than ECOS or other second-order (interior-point) methods
Your Model
Observations
- Solver: SCS (default opts)
- Fails / goes haywire judging by the progress of the residuals (probably due to numerical issues)
- Looks even worse on my machine
- Solver: ECOS (default opts)
- Fails (reports primal infeasible, due to numerical issues)
Analysis
- You won't be able to fix these numerical issues by increasing iteration limits alone
- More involved tuning of feasibility tolerances and related settings would be needed (a sketch of how such options are passed through CVXPY follows below)!
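For completeness, here is a minimal, hedged sketch of how such tolerance and iteration options are forwarded through CVXPY, on stand-in random data rather than the question's CSVs (option names correspond to SCS 2.x and ECOS; verify against your installed versions):

import numpy as np
import cvxpy as cp

# stand-in data only, not the question's CSVs
A_demo = np.random.randn(50, 10)
b_demo = np.random.randn(50)
x = cp.Variable(10)
prob = cp.Problem(cp.Minimize(cp.sum_squares(A_demo @ x - b_demo)),
                  [x >= 0, cp.sum(x) == 1])

# SCS: raise the iteration limit, tighten the convergence tolerance
prob.solve(solver=cp.SCS, max_iters=20000, eps=1e-6, verbose=True)

# ECOS: adjust feasibility/optimality tolerances and the iteration limit
prob.solve(solver=cp.ECOS, max_iters=200,
           feastol=1e-8, abstol=1e-7, reltol=1e-7, verbose=True)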
Transformation / Fixing
Can we make those solvers work? I think so.
Instead of minimizing the sum of squares, let's minimize the l2-norm. This is equivalent with regard to the solution (squaring is monotone on nonnegative values, so the argmin is unchanged), and we can simply square the objective value afterwards if we are interested in it!
This is motivated by the following advice (from the CVX users' guide):
One particular reformulation that we strongly encourage is to eliminate quadratic forms—that is, functions like sum_square, sum(square(.)) or quad_form—whenever it is possible to construct equivalent models using norm instead. Our experience tells us that quadratic forms often pose a numerical challenge for the underlying solvers that CVX uses.
We acknowledge that this advice goes against conventional wisdom: quadratic forms are the prototypical smooth convex function, while norms are nonsmooth and therefore unwieldy. But with the conic solvers that CVX uses, this wisdom is exactly backwards. It is the norm that is best suited for conic formulation and solution. Quadratic forms are handled by converting them to a conic form—using norms, in fact! This conversion process poses some interesting scaling challenges. It is better if the modeler can eliminate the need to perform this conversion.
Code
import pandas as pd
import cvxpy as cp
import numpy as np

# read the data posted in the question
A = pd.read_csv('A_matrix.csv').to_numpy()
b = pd.read_csv('b_vector.csv').to_numpy().ravel()

# norm-based reformulation: minimize ||A x - b||_2 over the probability simplex
x = cp.Variable(61)
prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b)), [x >= 0, cp.sum(x) == 1])
result = prob.solve(solver=cp.SCS, verbose=True)

print("optimal value: ", prob.value)
print("cvxpy solution:")
print(x.value, np.sum(x.value))
Output solver=cp.SCS (slow CPU)
Valid solver state + slow + the solution does not look robust enough -> it fluctuates symmetrically around 0 -> large primal-feasibility error with respect to x >= 0
Could probably be improved by tuning, but it's likely better to use a different solver here; I did not analyze this further. (A small clean-up sketch follows after this output.)
----------------------------------------------------------------------------
SCS v2.1.2 - Splitting Conic Solver
(c) Brendan O'Donoghue, Stanford University, 2012
----------------------------------------------------------------------------
Lin-sys: sparse-direct, nnz in A = 2446295
eps = 1.00e-04, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
acceleration_lookback = 10, rho_x = 1.00e-03
Variables n = 62, constraints m = 278543
Cones: primal zero / dual free vars: 1
linear vars: 61
soc vars: 278481, soc blks: 1
Setup time: 1.63e+00s
----------------------------------------------------------------------------
Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
----------------------------------------------------------------------------
0| 9.14e+18 1.39e+20 1.00e+00 -5.16e+20 6.20e+23 6.04e+23 1.28e-01
100| 8.63e-01 1.90e+02 7.79e-01 5.96e+02 4.81e+03 1.17e-14 1.04e+01
200| 5.09e-02 3.50e+02 1.00e+00 6.20e+03 -1.16e+02 5.88e-15 2.08e+01
300| 3.00e-01 3.71e+03 7.64e-01 9.62e+03 7.19e+04 4.05e-15 3.17e+01
400| 5.19e-02 1.76e+02 1.91e-01 4.71e+03 6.94e+03 3.87e-15 4.25e+01
500| 4.60e-02 2.66e+02 2.83e-01 5.70e+03 1.02e+04 6.48e-15 5.25e+01
600| 5.13e-03 1.08e+02 1.24e-01 5.80e+03 7.44e+03 1.72e-14 6.23e+01
700| 3.35e-03 6.81e+01 9.64e-02 5.39e+03 4.44e+03 5.94e-15 7.15e+01
800| 1.62e-02 8.52e+01 1.17e-01 5.51e+03 6.97e+03 3.96e-15 8.07e+01
900| 1.93e-02 1.57e+01 1.89e-02 5.58e+03 5.38e+03 5.04e-15 8.98e+01
1000| 6.94e-03 6.85e+00 7.97e-03 5.57e+03 5.48e+03 1.75e-15 9.91e+01
1100| 4.64e-03 7.64e+00 1.42e-02 5.66e+03 5.50e+03 1.91e-15 1.09e+02
1200| 2.25e-04 3.25e-01 4.00e-04 5.61e+03 5.60e+03 5.33e-15 1.18e+02
1300| 4.73e-05 9.05e-02 5.78e-05 5.60e+03 5.60e+03 6.16e-15 1.28e+02
1400| 6.27e-07 4.58e-03 3.22e-06 5.60e+03 5.60e+03 7.17e-15 1.36e+02
1500| 2.02e-07 5.27e-05 4.58e-08 5.60e+03 5.60e+03 5.61e-15 1.46e+02
----------------------------------------------------------------------------
Status: Solved
Timing: Solve time: 1.46e+02s
Lin-sys: nnz in L factor: 2726730, avg solve time: 2.54e-02s
Cones: avg projection time: 1.16e-03s
Acceleration: avg step time: 5.61e-02s
----------------------------------------------------------------------------
Error metrics:
dist(s, K) = 7.7307e-12, dist(y, K*) = 0.0000e+00, s'y/|s||y| = 3.0820e-18
primal res: |Ax + s - b|_2 / (1 + |b|_2) = 2.0159e-07
dual res: |A'y + c|_2 / (1 + |c|_2) = 5.2702e-05
rel gap: |c'x + b'y| / (1 + |c'x| + |b'y|) = 4.5764e-08
----------------------------------------------------------------------------
c'x = 5602.9367, -b'y = 5602.9362
============================================================================
optimal value: 5602.936687635506
cvxpy solution:
[ 1.33143619e-06 -5.20173272e-07 -5.63980428e-08 -9.44340768e-08
6.07765135e-07 7.55998810e-07 8.45038786e-07 2.65626921e-06
-1.35669263e-07 -4.88286704e-07 -1.09285233e-06 8.63799377e-07
2.85145903e-07 -1.22240651e-06 2.14526505e-07 -2.40179173e-06
-1.75042884e-07 -1.27680170e-06 -1.40486649e-06 -1.12113037e-06
-2.26601198e-07 1.39878723e-07 -3.19396803e-06 -6.36480154e-07
2.16005860e-05 1.18205616e-06 2.15620316e-06 -1.94093348e-07
-1.88356275e-06 -7.07687270e-06 -1.99902966e-06 -2.28894738e-06
1.00000188e+00 -9.95601469e-07 -1.26333877e-06 1.26336565e-06
-5.31474195e-08 -9.81111443e-07 2.22755569e-07 -7.49418940e-07
-4.77882668e-07 6.89785690e-07 -2.46822613e-06 -5.73596077e-08
5.99307819e-07 -2.57301316e-07 -7.59150986e-07 -1.23753681e-08
-1.39938273e-06 1.48526305e-06 -2.41075790e-06 -3.50224485e-07
1.79214177e-08 6.71852182e-07 -5.10880844e-06 2.44821668e-07
-2.88655782e-06 -2.45457029e-07 -4.97712502e-07 -1.44497848e-06
-2.22294748e-07] 0.9999895863519757
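Not part of the original run, but a minimal clean-up sketch (continuing the script from the Code section) would clip SCS's tiny negative entries and renormalize, so the vector lies on the probability simplex again:

# post-process the SCS solution: enforce x >= 0 and sum(x) == 1 again
x_clean = np.clip(x.value, 0.0, None)
x_clean /= x_clean.sum()
print(x_clean, x_clean.sum())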
Output solver=cp.ECOS (slow CPU)
Valid solver state + much faster + the solution looks OK
ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It pcost dcost gap pres dres k/t mu step sigma IR | BT
0 +0.000e+00 -0.000e+00 +7e+04 9e-01 1e-04 1e+00 1e+03 --- --- 1 2 - | - -
1 -5.108e+01 -4.292e+01 +7e+04 6e-01 1e-04 9e+00 1e+03 0.0218 3e-01 4 5 4 | 0 0
2 +2.187e+02 +2.387e+02 +5e+04 6e-01 8e-05 2e+01 8e+02 0.3427 4e-02 4 5 5 | 0 0
3 +1.109e+03 +1.118e+03 +1e+04 3e-01 2e-05 9e+00 2e+02 0.7403 6e-03 4 5 5 | 0 0
4 +1.873e+03 +1.888e+03 +1e+04 2e-01 2e-05 1e+01 2e+02 0.2332 1e-01 5 6 6 | 0 0
5 +3.534e+03 +3.565e+03 +4e+03 8e-02 8e-06 3e+01 7e+01 0.7060 2e-01 5 6 6 | 0 0
6 +5.452e+03 +5.453e+03 +2e+02 2e-03 2e-07 1e+00 3e+00 0.9752 2e-03 6 8 8 | 0 0
7 +5.584e+03 +5.585e+03 +4e+01 4e-04 4e-08 4e-01 7e-01 0.8069 6e-02 2 2 2 | 0 0
8 +5.602e+03 +5.602e+03 +5e+00 5e-05 6e-09 8e-02 9e-02 0.9250 5e-02 2 2 2 | 0 0
9 +5.603e+03 +5.603e+03 +1e-01 1e-06 1e-10 2e-03 2e-03 0.9798 2e-03 5 5 5 | 0 0
10 +5.603e+03 +5.603e+03 +6e-03 4e-07 6e-12 9e-05 1e-04 0.9498 3e-04 5 5 5 | 0 0
11 +5.603e+03 +5.603e+03 +4e-04 4e-07 3e-13 7e-06 6e-06 0.9890 4e-02 1 2 2 | 0 0
12 +5.603e+03 +5.603e+03 +1e-05 4e-08 8e-15 2e-07 2e-07 0.9816 8e-03 1 2 2 | 0 0
13 +5.603e+03 +5.603e+03 +2e-07 7e-10 1e-16 2e-09 2e-09 0.9890 1e-04 5 3 4 | 0 0
OPTIMAL (within feastol=7.0e-10, reltol=2.7e-11, abstol=1.5e-07).
Runtime: 18.727676 seconds.
optimal value: 5602.936884707248
cvxpy solution:
[7.47985848e-11 3.58238148e-11 4.53994101e-11 3.73056632e-11
3.47224797e-11 3.62895261e-11 3.59367993e-11 4.03642466e-11
3.58643375e-11 3.24886989e-11 3.25080912e-11 3.34866983e-11
3.66296670e-11 3.89612422e-11 3.54489152e-11 7.07301971e-11
3.95949165e-11 3.68235605e-11 3.05918372e-11 3.43890675e-11
3.71817538e-11 3.62561876e-11 3.55281653e-11 3.55800928e-11
4.10876077e-11 4.12877804e-11 4.11174782e-11 3.35519296e-11
3.43716575e-11 3.56588133e-11 3.66118962e-11 3.68789703e-11
9.99999998e-01 3.34857869e-11 3.21984616e-11 5.82577263e-11
2.85751155e-11 3.64710243e-11 3.59930485e-11 5.04742702e-11
3.07026084e-11 3.36507487e-11 4.19786324e-11 8.35032700e-11
3.33575857e-11 3.42732986e-11 3.70599423e-11 4.73856413e-11
3.39708564e-11 3.64354428e-11 2.95022064e-11 3.46315519e-11
3.04124702e-11 4.07870093e-11 3.57782184e-11 3.71824186e-11
3.72394185e-11 4.48194963e-11 4.09635820e-11 6.45638394e-11
4.00297122e-11] 0.9999999999673748
Outro
Final remarks
- The above reformulation might be enough to let you solve your problem
- ECOS beating SCS on this large-scale problem is unintuitive, but it can always happen, and I won't analyze it here (SCS is still a great solver, and so is ECOS, despite the two taking very different approaches; the open-source community should be happy to have both!)
- I don't think I would go for these solvers with ML-scale data if I had time to implement something more customized
- The easiest approach that comes to mind for large-scale solving here:
- Projected (accelerated) gradient, a first-order method that is very robust with respect to your constraints (a sketch follows after this list)
- "projection onto the probability simplex" (which you need) is a common, well-researched operation
- Going by the final weights/coefficients, the data looks strange!
- There seems to be one strongly dominating column (information leakage?); I don't know
- After rounding, both solvers output the solution vector:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
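To make the projected-gradient suggestion concrete, here is a minimal FISTA-style sketch of my own (random stand-in data, not your CSVs; the simplex projection is the standard sort-based algorithm, e.g. Duchi et al. 2008):

import numpy as np

def project_simplex(v):
    # Euclidean projection onto {x : x >= 0, sum(x) = 1} (sort-based algorithm)
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u * idx > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def accelerated_projected_gradient(A, b, n_iter=500):
    # FISTA-style iteration for min 0.5*||A x - b||^2 s.t. x on the simplex
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the gradient
    x = np.full(A.shape[1], 1.0 / A.shape[1])    # feasible start: uniform weights
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        x_new = project_simplex(y - grad / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

# tiny demo on random stand-in data
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((200, 61))
b_demo = rng.standard_normal(200)
x_hat = accelerated_projected_gradient(A_demo, b_demo)
print(x_hat.sum(), x_hat.min())                  # should be ~1.0 and >= 0

For a tall, skinny A like yours, each iteration costs only two matrix-vector products, so this should scale well.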