
While trying to solve a logistic regression problem using cvxpy, I got a large amount of terminal output when calling the solve() function, even though I had not programmed any print statements. Furthermore, no information about the problem was printed to the terminal even though verbose was set to True, and the optimal value could not be accessed.

I guess I'm doing something wrong in the problem formulation but can't quite figure out what it is.

The problem was defined as follows in the minimal code example:

import numpy as np
import cvxpy as cp

y_vec = np.random.choice([0, 1], size=(728,), p=[9./10, 1./10])
M_mat = np.random.choice([0, 1], size=(728,801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - \
        cp.log(1 + cp.exp((M_mat[:, i].T @ beta)))

prob = cp.Problem(cp.Maximize(objective))
prob.solve(verbose=True)
print("Optimal var reached",  beta.value)

Both y_vec and M_mat are numpy arrays with data type int64. Both are selection matrices for the classification problem consisting of only 0 and 1. For the purpose of the minimal code example they are randomly generated to reproduce the error. Furthermore M_mat[:, i].T @ beta was checked to result in a scalar as intended.

When I execute the code, I get a lot of printouts like the one below, and the program terminates after a certain number of them.

[Image: end of the terminal printout, after which the program terminates]

Only the end of the printout, where the program terminates, is shown here. There are many blocks of the form log(1.0 + exp([ 0. 0. ...... 0.] * var0)), where the sequence of numbers has the same length as the variable beta.

I find this result quite confusing. How can I arrive at a single vector for the optimization variable beta? Any help is much appreciated!

Manumerous

2 Answers


Your original formulation of the problem is not DCP-compliant. I admit the output is far from ideal; CVXPY is printing the expressions that are not DCP-compliant, but there are so many of them that the output is useless. Replacing the 400 in your for loop with 1, you get the following output, which is more helpful because it fits in your terminal window.

The objective is not DCP. Its following subexpressions are not:
log(1.0 + exp([0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.
 1. 0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0.
 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1.
 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 1. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1. 0. 1. 0.
 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0.
 0. 0. 0. 0. 1. 1. 0. 0.] @ var6400))

Akshay Agrawal

After some trial and error I found that using the cvxpy.logistic() function results in a successful computation of the solution, with the desired output vector.

This was achieved by reformulating the objective function as follows:

objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - cp.logistic(M_mat[:, i].T @ beta)

Although both formulations should be mathematically equivalent according to the Atomic Functions page of the CVXPY documentation, they produce drastically different results. I don't know why this is the case. I hope the solution is nonetheless useful to somebody, and I'm curious to learn why the behavior is so different if someone knows more.

Manumerous
  • It tells me I can only accept my own answer tomorrow, so I will do that tomorrow :) – Manumerous May 27 '20 at 12:34
  • Excellent. Thank you for following up and posting it (many people never bother). – msanford May 27 '20 at 12:35
  • You're welcome! May I ask whether you think it would be worth posting a follow-up question to ask why these two different behaviors exist? Even if I can continue my work now, I guess this is hardly the way it's intended in cvxpy. Maybe it would even be worth a bug report? – Manumerous May 27 '20 at 12:41
  • CVXPY knows the curvature of the logistic function is convex, as mentioned in the atomic functions documentation. It doesn't know that cp.log(1+cp.exp(x)) is convex, because that expression doesn't follow DCP rules. – Akshay Agrawal May 27 '20 at 21:15