
I have a system of linear equations Ax = b, where A is an (m x n) matrix, x is (n x 1) and b is (m x 1). The matrix A is almost all zeros and contains only ~ n*m^(1/2) non-zero elements. m and n can be quite large, e.g. n=28680, m=22500 or n=28680, m=62500. With such dimensions an exact solution of the system almost certainly does not exist. Therefore I need to find the vector x0 at which the minimum of sum( (Ax - b)^2 ) is attained, and it is highly desirable that x0 consist only of 0s and 1s. Thus the problem I need to solve is: sum( (Ax - b)^2 ) -> min, x from {0,1}^n.

How did I try to solve the problem? First I solved a simplified version: sum( (Ax - b)^2 ) -> min with x real-valued. That problem was solved quickly using TensorFlow methods. I then rounded the resulting solution to {0,1}, but the quality of the rounded vector was very poor.
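
For illustration, here is a minimal sketch of that relaxation-plus-rounding approach (written with SciPy's lsq_linear instead of the TensorFlow code I actually used; the random A and b are just stand-ins for my data):

import numpy as np
from scipy import sparse
from scipy.optimize import lsq_linear

# illustrative stand-ins for my real inputs
np.random.seed(0)
m, n = 1600, 4950
A = sparse.random(m, n, density=1/m**0.5, format="csr") * 255  # ~ n*sqrt(m) non-zeros
b = 255 * np.random.randn(m)

# continuous relaxation: minimize ||Ax - b||^2 subject to 0 <= x <= 1
res = lsq_linear(A, b, bounds=(0, 1))

# naive rounding back to {0, 1} -- this is the step where the quality is lost
x0 = np.round(res.x)
print("relaxed objective:", np.sum((A @ res.x - b)**2))
print("rounded objective:", np.sum((A @ x0 - b)**2))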

Then I tried to solve the complete problem. The CVXPY library was suggested to me. Researching the official CVXPY website, I found out that I need a solver for the mixed-integer quadratic (MIQP) class, so I wrote a program using the SCIP solver. Since it would take too long to wait for a provably optimal solution, I limited the solution time to 200 seconds. The program works, but only for small m and n, e.g. n=4950, m=1600. If m and n are increased further, SCIP returns a plain zero vector as the solution, even though practically any vector with even one 1 is closer to the optimum. Here is the program code:

import cvxpy as cp
import numpy as np

# I tried to simulate my inputs for you.
# Here the problem manifests itself already at m, n = 1600, 4950.
np.random.seed(0)
m, n = 1600, 4950
A = np.round(np.random.rand(m, n) / (1 - 1/m**(1/2)) / 2) * 255 * (m**(1/2)/1800)**1.1
b = 255 * np.random.randn(m)

x = cp.Variable(n, integer=True)
objective = cp.Minimize(cp.sum_squares(A @ x - b))
constraints = [x >= 0, x <= 1]
prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.SCIP, verbose=True, scip_params={"limits/time": 50})
print("Status: ", prob.status)
print("The optimal value is", prob.value)
print("A solution x is")
print(x.value)
print("Non-zero ", int(np.sum(x.value)))  # x is binary, so the sum counts the 1s

My questions are the following:

  1. Are there better options for solving my problem?
  2. Maybe I should use some other solver? I couldn't find another free solver for my problem.
  3. If SCIP is the best option, how can I make it work for large m and n? I don't understand why it stops working...

Update: I thought it would be better to avoid objectives with a discontinuous derivative, but the sum of absolute errors seems to work better than the sum of squares, at least on my example. As for the matrix A, it is fine, and unfortunately there is no exact solution to my problem. However, my real matrix A is still not noise like the one in the example, so I hope it admits a solution close to the exact one. At the same time, I am 100% sure the zero vector is extremely far from the optimal solution: I checked vectors consisting of all zeros with a single 1, and ~80% of them were closer to the optimum than the zero vector.

AirSquid's CVXPY code works for me even with matrices up to 27000x23000 and without an exact solution; beyond that I simply run out of RAM. On my real data the sum-of-absolute-errors variant no longer returns the zero vector, but there is now the opposite problem: far too many 1s. sum(abs(Ax - b)) is now smaller for the zero vector than for the solution the solver gives me, and I don't understand why that is happening.
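
For context, the check I am doing on my real data is simply comparing objectives of candidate vectors directly (A_real, b_real and x_solver are stand-ins for my actual inputs and the vector the solver returned):

import numpy as np

def abs_err(A, x, b):
    # sum of absolute row errors for a candidate solution x
    return np.sum(np.abs(A @ x - b))

# A_real, b_real: my actual inputs; x_solver: the solver's output
print("zero vector:", abs_err(A_real, np.zeros(A_real.shape[1]), b_real))
print("solver x   :", abs_err(A_real, x_solver, b_real))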

Alex
  • Is it important in your application that you use the sum of squared error as a quality measure? If you can accept sum of absolute error, this can be made linear and will (likely) be much more performant. – AirSquid Apr 16 '23 at 16:25
  • Also: how important is it that `x` is integral? There are much, much faster least-squares linear solvers if this is made continuous. – Reinderien Apr 16 '23 at 18:36
  • @Reinderien As it turned out - very important. I wrote that I tried to use a continuous solution and the result did not suit me. – Alex Apr 16 '23 at 23:30
  • Re: update: using abs error enables the model to be processed as linear, which delivers many benefits over quadratic, if you don't need SSE. Derivatives are not used. The solution given in the examples below **is** the optimal solution. The zero vector is the most likely solution *for the example data you created, which is malformed*. Regarding your questioning of the solution for the real data, it is impossible to troubleshoot without the code used to solve and your QA of the solution. If you can provide an exemplar matrix in table or csv format that can demonstrate you are getting... – AirSquid Apr 17 '23 at 00:24
  • a non-optimal solution when the solver reports "optimal" (doubtful), that would be interesting to see. – AirSquid Apr 17 '23 at 00:25
  • @AirSquid I am sorry for being inaccurate; the solver doesn't report "optimal" because I stop it after 100 sec. I used just a 4900x9730 matrix and I expected to get a more optimal solution in that time. Once again, where I see the problem: the resulting solution is VERY bad, it is worse than the zero vector. You wrote that with my matrix it's normal to get a lot of zeros in the solution, but in 100 seconds the solver gave me a vector with 7000 "1"s out of 9730. – Alex Apr 17 '23 at 00:50
  • Are you that time critical? My pyomo example below solved to optimality with full size problem in 150 seconds. Depending on what pre-processing the solver has to do, 100 seconds can be a very short cutoff. – AirSquid Apr 17 '23 at 00:58
  • @AirSquid I'm doing a rather useless thing for my own amusement, so it would be annoying to wait half a day for a solution :D I just installed pyomo and CBC, I will check how it works. But from experience with cvxpy on my data, the task takes much longer to solve than with the "good" data from your examples. – Alex Apr 17 '23 at 01:06
  • @AirSquid I have tested pyomo with CBC and it turned out much better than anything before! In 150 seconds on a small matrix, I got something like a good result. – Alex Apr 17 '23 at 01:12
  • @AirSquid I think my issue has been resolved, thanks! – Alex Apr 17 '23 at 01:16

1 Answer


I ran with the assumption that sum of absolute error is a good enough quality measure for the solution instead of SSE. With that change the model can be easily linearized and, I think, becomes much more tractable for a MIP solver than for a quadratic solver. Note that both examples below assume there is a good (here, exact) solution. If the matrices are basically "noise" and you are looking for a best fit, the solvers will almost certainly struggle a lot more, as the problem is likely more degenerate.
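
For reference, the linearization is the standard trick: introduce one auxiliary variable t_i per row that is forced, by minimization pressure, down onto the absolute row error:

minimize    sum_i t_i
subject to  t_i >=  (Ax - b)_i     for every row i
            t_i >= -(Ax - b)_i     for every row i
            x_j in {0, 1}          for every column j

At the optimum each t_i sits at max( (Ax - b)_i, -(Ax - b)_i ) = |(Ax - b)_i|, so the objective equals sum(abs(Ax - b)) with no nonlinearity in the model.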

Before looking at the solve, I think there are issues with your construction of A above. Build a small m x n matrix and check that it is what you expect. I get a lot of duplicated, all-positive values, so it is entirely possible that the all-zeros x vector is the optimal solution for a b vector that should average ~0. I changed things around a bit, using positive/negative values and a different construction, to get the density you want.
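
For example, a quick look at a small instance built with your exact formula shows the issue:

import numpy as np

np.random.seed(0)
m, n = 16, 25  # small enough to eyeball
A = np.round(np.random.rand(m, n)/(1-1/m**(1/2))/2)*255*(m**(1/2)/1800)**1.1
print(np.unique(A))     # only two distinct values, and both are >= 0
print((A != 0).mean())  # fraction of non-zero entries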

If you take my cvxpy shell and switch it back to SSE, I'd be curious to see what kind of performance you get vs. the linear version. I couldn't get an appropriate solver working for that, as I don't use cvxpy much.
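
If you do try it, the only substantive change to the shell below should be the objective line, plus a solver that handles mixed-integer quadratic problems (e.g. SCIP):

# swap the linear objective for SSE; the rest of the model is unchanged
objective = cp.Minimize(cp.sum_squares(A @ x - b))
prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.SCIP)  # MIQP-capable solver needed for sum_squares + integer x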

cvxpy is kind of a natural for problems like this that can easily be expressed in matrix format. I'm jumping through a few hoops in pyomo to get to the same point before the solve, but it works quite well. On my 8-year-old machine, pyomo takes about 200 sec to construct the maximally sized problem (as you can see in the results).
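
If that construction time becomes a bottleneck, one idea (a sketch, not something I benchmarked) is to index the nonzeros of A by row once with scipy.sparse, so the rowsum rule doesn't scan all n columns for every row:

from collections import defaultdict
from scipy import sparse

# collect (column, value) pairs per row, once, up front
A_coo = sparse.coo_matrix(A)
row_nz = defaultdict(list)
for i, j, v in zip(A_coo.row, A_coo.col, A_coo.data):
    row_nz[i].append((j, v))

# drop-in replacement for the rowsum rule in the pyomo model below:
# each constraint now touches only ~sqrt(m) entries instead of n
@model.Constraint(model.M)
def rowsum(model, m):
    return model.row_err[m] == sum(v * model.x[j] for j, v in row_nz[m]) - b[m]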

my cvxpy solution below works fine for smaller instances but barfs "infeasible" for matrices of 20k x 20k or such, so there is maybe something internal there that I don't understand, as the problem is clearly feasible. Perhaps somebody with better knowledge of cvxpy can tinker and see what is up.

CVXPY for small matrix:

import cvxpy as cp
import numpy as np
from time import time

# build a sparse A with mixed positive/negative values and a known
# binary solution, so an exact optimum (objective 0) exists
np.random.seed(0)
m, n = 5_000, 5_000  # 20_000, 20_000  <--- barfs
A_base = np.random.randint(-1000, 1000, (m, n))
marks = np.random.binomial(n=1, p=(n*m)**0.5/(m*n), size=(m, n))
A = np.where(marks, A_base, 0)
true_x = np.random.binomial(n=1, p=0.6, size=n)
b = A @ true_x

tic = time()
x = cp.Variable(n, integer=True)
objective = cp.Minimize(cp.sum(cp.abs(A @ x - b)))
constraints = [x >= 0, x <= 1]
prob = cp.Problem(objective, constraints)
res = prob.solve()  # solver=cp.SCIP, verbose=True, scip_params={"limits/time": 50}
toc = time()
print("Status: ", prob.status)
print("The optimal value is", prob.value)
print("A solution x is")
print(x.value)
print("Non-zero ", np.sum(x.value))
print(f'solver time:  {toc-tic : 0.1f} sec.')

CVXPY output:

Status:  optimal
The optimal value is 0.0
A solution x is
[ 1.  1.  1. ... -0.  1. -0.]
Non-zero  1907.0
solver time:   3.5 sec.
[Finished in 5.5s]
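
As a quick sanity check independent of the solver's reported objective, you can recompute the error from x.value directly (rounding defensively, since solvers return floats):

x_sol = np.round(x.value).astype(int)
print("recomputed abs error:", np.sum(np.abs(A @ x_sol - b)))  # 0 for this instance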

PYOMO for full-size matrix:

import numpy as np
import pyomo.environ as pyo
from time import time

np.random.seed(0)
m, n = 28_680, 62_500
A_base = (np.random.randint(-1000, 1000, (m, n)))
marks = np.random.binomial(n=1, p=(n*m)**0.5/(m*n), size=(m, n))
A = np.where(marks, A_base, 0)
true_x = np.random.binomial(n=1, p=0.6, size=n)
b = A @ true_x
# b = np.random.randint(-100, 100, n)
# print(b)

tic = time()
model = pyo.ConcreteModel()

model.N = pyo.Set(initialize=range(n))
model.M = pyo.Set(initialize=range(m))

# initializing x only because some values of x[n] may end up unconstrained and this prevents "None" in result
model.x           = pyo.Var(model.N, within=pyo.Binary, initialize=0)
model.row_err     = pyo.Var(model.M)
model.row_abs_err = pyo.Var(model.M)

# constrain the rowsum/error appropriately
@model.Constraint(model.M)
def rowsum(model, m):
    return model.row_err[m] == sum(A[m,n] * model.x[n] for n in model.N if A[m,n]) - b[m]


# constrain the abs error
@model.Constraint(model.M)
def abs_1(model, m):
    return model.row_abs_err[m] >= model.row_err[m]

@model.Constraint(model.M)
def abs_2(model, m):
    return model.row_abs_err[m] >= -model.row_err[m]

# minimize sum of absolute row errors
model.obj = pyo.Objective(expr=sum(model.row_abs_err[m] for m in model.M))
toc = time()
print(f'starting solver...  at time: {toc-tic: 0.1f} sec.')
solver = pyo.SolverFactory('cbc')
res = solver.solve(model)
toc = time()

print(f'total solve time:  {toc-tic: 0.1f} sec')
print(res)
# model.x.display()

x = np.array([pyo.value(model.x[n]) for n in model.N], dtype=int)

print(f'verifying solution:  {np.sum(np.abs(A @ x - b))}')  # abs errors, so +/- residuals cannot cancel

PYOMO output:

starting solver...  at time:  228.2 sec.
solver time:   384.8 sec

Problem: 
- Name: unknown
  Lower bound: 0.0
  Upper bound: 0.0
  Number of objectives: 1
  Number of constraints: 40941
  Number of variables: 50488
  Number of binary variables: 30839
  Number of integer variables: 30839
  Number of nonzeros: 20949
  Sense: minimize
Solver: 
- Status: ok
  User time: -1.0
  System time: 123.18
  Wallclock time: 152.91
  Termination condition: optimal
  Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 21
      Number of created subproblems: 21
    Black box: 
      Number of iterations: 2668
  Error rc: 0
  Time: 153.19930219650269
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

verifying solution:  0
AirSquid
  • Thank you for your response! My answer doesn't fit in a comment so I've updated the question. In short, the opposite problem appeared. – Alex Apr 16 '23 at 23:28