Scipy.optimize.minimize defining a constraint that limits the number of variables that have non zero values

Question

I am working on an optimization problem with approx. 700 variables, using scipy.optimize.minimize (method="SLSQP"). Each variable is bounded between 0 and 1.

My issue is that the problem requires only 400 variables to have values greater than 0 (the other 300 must be equal to zero). I do not know precisely which variables are those; otherwise, I would have excluded them. Therefore, I need to somehow create a constraint specifying this.

My problem is how to define the constraint. Any thoughts?

Update: Following a comment from joni (many thanks), it seems that I need to formulate the problem as a mixed-integer linear optimization problem. I am unfamiliar with this, so any pointers would be appreciated.

Below I provide more details on the intended optimization.

Problem: minimize the difference between the weights of an investor portfolio vs. the weights of a stock index benchmark (with 700 stocks) subject to several constraints (allocation to specific industries, for instance). The difficulty comes in specifying an additional constraint in which the investor only allocates positive weights to 400 stocks, while the remaining 300 must be equal to zero.

Objective function --> min ((w - wbenchmark)^2).sum()

Bounds --> (0, 1) for every weight

Constraints --> industry allocations (no problem here)

--> limit the number of stocks with weights greater than zero to at most 400, with the remaining 300 equal to zero
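For reference, the part of the problem that SLSQP can handle directly (everything except the cardinality limit) might be set up roughly as in the sketch below. The data are made up: `w_bench`, the `industry` mask, and the full-investment constraint are illustrative assumptions, not taken from the actual problem.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-ins (assumptions): 6 stocks instead of 700.
w_bench = np.array([0.30, 0.25, 0.20, 0.10, 0.10, 0.05])
industry = np.array([1, 1, 0, 0, 1, 0], dtype=float)  # hypothetical industry membership mask


def objective(w):
    # Sum of squared deviations from the benchmark weights.
    return np.sum((w - w_bench) ** 2)


def gradient(w):
    return 2.0 * (w - w_bench)


constraints = [
    # Assumed full-investment constraint: weights sum to 1.
    {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
    # Hypothetical industry constraint: match the benchmark's weight in that industry.
    {"type": "eq", "fun": lambda w: industry @ (w - w_bench)},
]

n = len(w_bench)
res = minimize(
    objective,
    x0=np.full(n, 1.0 / n),
    jac=gradient,
    bounds=[(0.0, 1.0)] * n,
    constraints=constraints,
    method="SLSQP",
    options={"ftol": 1e-12, "maxiter": 200},
)
print(res.x)
```

Without a cardinality limit the solution generally gives a non-zero weight to every stock. A "number of non-zero weights" function is discontinuous, so it cannot simply be added as one more smooth constraint for SLSQP.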

"that uses approx. 700 variables"?!? I'd be curious about the underlying problem (domain). 700 variables is not something I've come across: are you modelling Earth? — 9769953, Sep 23 '22 at 14:06
Far from modeling Earth, it is a portfolio optimization problem. It is not uncommon to have that number of stocks, bonds, or other types of financial assets. — Quipz, Sep 23 '22 at 14:17
Thanks. Definitely a new domain to me. It's mostly that I'm just thinking that the amount of correlation between fitted parameters can be horrendous with that many parameters. — 9769953, Sep 23 '22 at 14:19
To rephrase/verify your issue: there are 300 unknown parameters that are constrained to be fixed at zero, and the remaining (unknown) 400 parameters will have to be larger than zero? — 9769953, Sep 23 '22 at 14:20
Yes. Let us assume that you have 700 stocks to choose from and allocate weights to (these are the parameters to optimize). However, you are limited to allocating positive weights to 400 of them. Therefore, you have to force the solution to make the weights of 300 stocks equal to zero. — Quipz, Sep 23 '22 at 14:40
This can easily be formulated with (additional) integer variables which in turn makes your optimization problem a mixed-integer nonlinear optimization problem (MINLP). However, it's worth mentioning that scipy.optimize can't solve MINLPs, only mixed-integer **linear** optimization problems. So it depends on the structure of your problem (quadratic, convex, etc) which solver is a reasonable choice. — joni, Sep 23 '22 at 15:13
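One way the binary-variable idea from the comment above might look with `scipy.optimize.milp` (added in SciPy 1.9) is sketched below. Because `milp` only accepts a linear objective, the sketch swaps the squared deviation for the absolute deviation sum(|w - wbenchmark|), which is a different objective from the one in the question. The function name `track_with_cardinality`, the toy benchmark, and the full-investment constraint are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp


def track_with_cardinality(w_bench, k):
    """Minimize sum(|w - w_bench|) with at most k non-zero weights (toy MILP)."""
    w_bench = np.asarray(w_bench, dtype=float)
    n = len(w_bench)
    I, Z = np.eye(n), np.zeros((n, n))
    ones, zeros = np.ones(n), np.zeros(n)
    neg_inf = np.full(n, -np.inf)

    # Decision vector x = [w, t, z]:
    #   w = weights, t = upper bounds on |w - w_bench|, z = binary on/off flags.
    c = np.concatenate([zeros, ones, zeros])  # minimize sum(t)

    constraints = [
        # w - w_bench <= t
        LinearConstraint(np.hstack([I, -I, Z]), neg_inf, w_bench),
        # w_bench - w <= t
        LinearConstraint(np.hstack([-I, -I, Z]), neg_inf, -w_bench),
        # w <= z, so z_i = 0 forces w_i = 0 (valid because w_i <= 1)
        LinearConstraint(np.hstack([I, Z, -I]), neg_inf, zeros),
        # sum(z) <= k: at most k stocks may be switched on
        LinearConstraint(np.concatenate([zeros, zeros, ones])[None, :], -np.inf, k),
        # Assumed full-investment constraint: sum(w) = 1
        LinearConstraint(np.concatenate([ones, zeros, zeros])[None, :], 1.0, 1.0),
    ]
    bounds = Bounds(np.zeros(3 * n), np.concatenate([ones, np.full(n, np.inf), ones]))
    integrality = np.concatenate([zeros, zeros, ones])  # only z is integer

    res = milp(c, integrality=integrality, bounds=bounds, constraints=constraints)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


w_bench = np.array([0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02])
w = track_with_cardinality(w_bench, k=3)
print(np.round(w, 4), int(np.sum(w > 1e-6)))
```

Industry allocations that are linear in w could be appended as further `LinearConstraint` rows. Keeping the quadratic objective would require a mixed-integer quadratic solver outside `scipy.optimize`.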
Besides the discrete-optimization approaches mentioned, in some domains (especially machine learning) people tackle this *heuristically* / non-exactly by adding an L1-norm penalty on those (cardinality-constrained) variables and tuning the factor of this norm. (https://inst.eecs.berkeley.edu/~ee127/sp21/livebook/l_lqp_min_card.html) — sascha, Sep 23 '22 at 16:50
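A minimal sketch of the L1-penalty heuristic described above, using SLSQP with bounds only. The helper name `solve_l1_penalized`, the toy benchmark, and the penalty values are assumptions. For non-negative weights the L1 norm is just sum(w), so the penalized objective stays smooth on the feasible region. One caveat: if the constraints fix the sum of the weights (for example, full investment), that term is constant and the penalty no longer pushes weights to zero.

```python
import numpy as np
from scipy.optimize import minimize

# Toy benchmark weights (assumption): 8 stocks instead of 700.
w_bench = np.array([0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02])


def solve_l1_penalized(w_bench, lam):
    """Minimize sum((w - w_bench)^2) + lam * sum(w) subject to 0 <= w <= 1."""
    n = len(w_bench)

    def fun(w):
        # With w >= 0, the L1 norm of w equals sum(w).
        return np.sum((w - w_bench) ** 2) + lam * np.sum(w)

    def jac(w):
        return 2.0 * (w - w_bench) + lam

    res = minimize(
        fun,
        x0=w_bench.copy(),
        jac=jac,
        bounds=[(0.0, 1.0)] * n,
        method="SLSQP",
        options={"ftol": 1e-12},
    )
    return res.x


# A larger penalty factor drives more weights to zero.
for lam in (0.0, 0.1, 0.22):
    w = solve_l1_penalized(w_bench, lam)
    print(lam, int(np.sum(w > 1e-3)))
```

In practice the penalty factor would be tuned (for example by bisection) until the number of non-zero weights drops to the target. The result is a heuristic, not a guaranteed optimum of the cardinality-constrained problem.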
