Optimization with discrete parameters in Matlab

Question

I have 12 sets of vectors (about 10-20 vectors each) and i want to pick one vector of each set so that a function f that takes the sum of these vectors as argument is maximized. In addition i have constraints for some components of that sum.

Example:

a_1 = [3 2 0 5], a_2 = [3 0 0 2], a_3 = [6 0 1 1], ... , a_20 = [2 12 4 3]
b_1 = [4 0 4 -2], b_2 = [0 0 1 0], b_3 = [2 0 0 4], ... , b_16 = [0 9 2 3]
...
l_1 = [4 0 2 0], l_2 = [0 1 -2 0], l_3 = [4 4 0 1], ... , l_19 = [3 0 9 0]

s = [s_1 s_2 s_3 s_4] = a_x + b_y + ... + l_z

Constraints:

s_1 > 40
s_2 < 100
s_4 > -20

Target: Chose x, y, ... , z to maximize f(s):

f(s) -> max

Where f is a nonlinear function that takes the vector s and returns a scalar.

Bruteforcing takes too long because there are about 5.9 trillion combinations, and since i need the maximum (or even better the top 10 combinations) i can not use any of the greedy algorithms that came to my mind.

The vectors are quite sparse, about 70-90% are zeros. If that is helping somehow ...?

The Matlab Optimization toolbox didnt help either since it doesnt much support for discrete optimization.

can you tell something about the non-linear function `f(s)`? what do you know about it? what can you assume? — Shai, Jun 25 '13 at 20:38
If you can supply a fully reproducible example (with the constraints and the obj function detailed out) we can suggest ways to go about solving the problem. — Ram Narasimhan, Jun 25 '13 at 21:47
@natan how `bintprog` is going to help? the objective function of `bintprog` is assumed **linear** — Shai, Jun 25 '13 at 22:08
I wrote also "or fminsearch", given more data I'd probably give a better comment... — bla, Jun 25 '13 at 22:11
About the function f(s): I will need different functions and im not 100% sure what they look like, but here are some examples: f(s) = s_1 * (s_2 * s_3 - s_2 + 2) or f(s) = s_1 * s_2 * 0.5 / (1.5 * s_3). Constraints are like in my example, only greater/smaller. — Johannes, Jun 25 '13 at 23:09
does `a_x` means that x is some number from 1 to 20, relating to a_1...a_20 ? What does `s1=` means ? just one of four vectors that are sampled from the set `a_i ... l_j`? — bla, Jun 26 '13 at 02:33
@natan: `1/f(s)` is problematic with methods that use derivatives (and other things as well); usually, simply `-f(s)` is the better choice. — Rody Oldenhuis, Jun 26 '13 at 12:48
@Phpdna: how do you know that? Suppose 75% of all combinations would be chopped away by the constraints. That leaves `(20^12)/4` or about **1 quadrillion** combinations. At a billion combinations per second, that means almost 12 days of number-crunching...doesn't seem very efficient to me. — Rody Oldenhuis, Jun 26 '13 at 12:52
Just to verify: 1) are your vectors indeed integers? If so, what kind (int8, uint64, etc.)? What's their *real* size, in your real program I mean? — Rody Oldenhuis, Jun 27 '13 at 05:24
The vectors are actually floats, however they only contain integers so rounding them to int8 is fine. They have 24 elements. — Johannes, Jun 27 '13 at 10:00

Rody Oldenhuis · Accepted Answer · 2013-06-27T07:16:48.383

Basically this is a lock-picking problem, where the lock's pins have 20 distinct positions, and there are 12 pins. Also:

some of the pin's positions will be blocked, depending on the positions of all the other pins.
Depending on the specifics of the lock, there may be multiple keys that fit

...interesting!

Based on Rasman's approach and Phpdna's comment, and the assumption that you are using int8 as data type, under the given constraints there are

>> d = double(intmax('int8'));
>> (d-40) * (d+100) * (d+20) * 2*d
ans =
    737388162

possible vectors s (give or take a few, haven't thought about +1's etc.). ~740 million evaluations of your relatively simple f(s) shouldn't take more than 2 seconds, and having found all s that maximize f(s), you are left with the problem of finding linear combinations in your vector set that add up to one of those solutions s.

Of course, this finding of combinations is no easy feat, and the whole method breaks down anyway if you are dealing with

int16:   ans = 2.311325368800510e+018
int32:   ans = 4.253529737045237e+037
int64:   ans = 1.447401115466452e+076

So, I'll discuss a more direct and more general approach here.

Since we're talking integers and a fairly large search space, I'd suggest using a branch-and-bound algorithm. But unlike the bintprog algorithm, you'd have to use different branching strategies, and of course, these should be based on a non-linear objective function.

Unfortunately, there is nothing like this in the optimization toolbox (or the File Exchange as far as I could find). fmincon is a no-go, since it uses gradient and Hessian information (which will usually be all-zero for integers), and fminsearch is a no-go, since you'll need a really good initial estimate, and the rate of convergence is (roughly) O(N), meaning, for this 20-dimensional problem you'll have to wait quite long before convergence, without the guarantee of having found the global solution.

An interval method could be a possibility, however, I personally have very little experience with this. There is no native interval-related stuff in MATLAB or any of its toolboxes, but there's the freely available INTLAB.

So, if you're not feeling like implementing your own non-linear binary integer programming algorithm, or are not in the mood for an adventure with INTLAB, there's really only one thing left: heuristic methods. In this link there is a similar situation, with an outline of the solution: use the genetic algorithm (ga) from the Global Optimization toolbox.

I would implement the problem roughly like so:

function [sol, fval, exitflag] = bintprog_nonlinear()

    %// insert your data here
    %// Any sparsity you may have here will only make this more 
    %// *memory* efficient, not *computationally*
    data = [... 
        ...  %// this will be an array with size 4-by-20-by-12
        ...  %// (or some permutation of that you find more intuitive)
        ];

    %// offsets into the 3D array to facilitate indexing a bit
    offsets = bsxfun(@plus, ...
        repmat(1:size(data,1), size(data,3),1), ...
        (0:size(data,3)-1)' * size(data,1)*size(data,2));   %//'

    %// your objective function
    function val = obj(X)

        %// limit "X" to integers in [1 20]
        X = min(max(round(X),1),size(data,3));

        %// "X" will be a collection of 12 integers between 0 and 20, which are 
        %// indices into the data matrix

        %// form "s" from "X"        
        s = sum(bsxfun(@plus, offsets, X*size(data,1) - size(data,1)));


        %// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX        
        %// Compute the NEGATIVE VALUE of your function here
        %// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX


    end

    %// your "non-linear" constraint function 
    function [C, Ceq] = nonlcon(X)

        %// limit "X" to integers in [1 20]
        X = min(max(round(X),1),size(data,3));

        %// form "s" from "X"        
        s = sum(bsxfun(@plus, offsets, X(:)*size(data,1) - size(data,1)));

        %// we have no equality constraints
        Ceq = [];

        %// Compute inequality constraints
        %// NOTE: solver is trying to solve C <= 0, so: 
        C = [...
            40 - s(1)
            s(2) - 100
            -20 - s(4)
        ];

    end

    %// useful GA options
    options = gaoptimset(...
        'UseParallel', 'always'...
        ...
    );

    %// The rest really depends on the specifics of the problem.
    %// Useful to look at will be at least 'TolCon', 'Vectorized', and of course, 
    %// 'PopulationType', 'Generations', etc.

    %// THE OPTIMZIATION 
    [sol, fval, exitflag] = ga(...
        @obj, size(data,3), ...  %// objective function, taking a vector of 20 values
        [],[], [],[], ...        %// no linear (in)equality constraints
        1,size(data,2), ...      %// lower and upper limits
        @nonlcon, options);      %// your "nonlinear" constraints


end

Note that even though your constraints are essentially linear, the way by which you must compute the value for your s necessitates the use of a custom constraint function (nonlcon).

Especially note that this is currently (probably) a sub-optimal way to use ga -- I don't know the specifics of your objective function, so a lot more may be possible. For instance, I currently use a simple round() to convert the input X to integers, but using 'PopulationType', 'custom' (with a custom 'CreationFcn', 'MutationFcn' etc.) might produce better results. Also, 'Vectorized' will likely speed things up a lot, but I don't know whether your function is easily vectorized.

And yes, I use nested functions (I just love those things!); it prevents these huge, usually identical lists of input arguments if you use sub-functions or stand-alone functions, and they can really be a performance boost because there is little copying of data. But, I realize that their scoping rules make them somewhat akin to goto constructs, and so they are -ahum- "not everyone's cup of tea"...you might want to convert them to sub-functions to prevent long and useless discussions with your co-workers :)

Anyway, this should be a good place to start. Let me know if this is useful at all.

score 0 · Answer 2 · answered Jun 26 '13 at 14:12

0

Unless you define some intelligence on how the vector sets are organized, there will be no intelligent way of solving your problem other then pure brute force.

Say you find s s.t. f(s) is max given constraints of s, you still need to figure out how to build s with twelve 4-element vectors (an overdetermined system if there ever was one), where each vector has 20 possible values. Sparsity may help, although I'm not sure how it is possible to have a vector with four elements be 70-90% zero, and sparsity would only be useful if there was some yet to be described methodology in how the vector are organized

So I'm not saying you can't solve the problem, I'm saying you need to rethink how the problem is set-up.

answered Jun 26 '13 at 14:12

Rasman

5,349
1
25
38

The fact that `s` should be a linear combination of given vectors (with all-unity coefficients) is in fact the toughest constraint in this problem. You could even say that that *is* the problem; the minimum of `f(s)` only a secondary consideration. – Rody Oldenhuis Jun 26 '13 at 14:34
About sparsity and the size of the vectors: My vectors have 24 elements, i just used smaller ones to make an example. I should probably have mentioned that. – Johannes Jun 26 '13 at 15:50

score 0 · Answer 3 · answered Apr 12 '15 at 05:49

I know, this answer is reaching you really late.

Unfortunately, the problem, as is, show not many patterns to be exploited, besides of brute force -Branch&Bound, Master& Slave, etc.- Trying a Master Slave approach -i.e. solving first the function continuous nonlinear problem as master, and solving the discrete selection as slave could help, but with as many combinations, and without any more information over the vectors, there is not too much space for work.

But based on the given continuous almost everywhere functions, based on combinations of sums and multiplication operators and their inverses, the sparsity is a clear point to be exploited here. If 70-90% of vectors are zero, almost a good part of the solution space will be close to zero, or close to infinite. Hence a 80-20 pseudo solution would discard easily the 'zero' combinations, and use only the 'infinite' ones.

This way, the brute-force could be guided.

Optimization with discrete parameters in Matlab

3 Answers3