Linear programming problem, searching for a group that has the most in common

Question

I have 6 matrices of binary values like this:

matrix 1:[0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1]
matrix 2:[0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
matrix 3:[0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
matrix 4:[1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
matrix 5:[1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
matrix 6:[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

I would like to know whether linear programming can find which group of three matrices has the most 1s in common.

In this case it would probably be matrices 4, 5 and 6, because they have 10 items in common:

[1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

I can't figure out a formulation that would tell, say, LINGO to evaluate all the groups of three and give me the best one.

Any help would be greatly appreciated, and I am willing to give a financial reward. Btw: sorry for my bad English.

How many total "matrices" do you have? (OBTW, I'd call those vectors as they are 1-dimensional, but for the solution to your problem, it likely won't matter) — AirSquid, May 02 '23 at 18:33
Yes, you're right, I should have called them vectors. I have 33 groups of vectors, each group has at most 8 vectors, and I want to solve each group separately. — Hrruuska, May 02 '23 at 18:56
Combinations(8, 3) is only 56 combinations to look at/evaluate. You could brute force this in many languages almost instantaneously, and it would almost certainly be faster than spinning up a solver... unless you are really interested in going the LP route. — AirSquid, May 02 '23 at 21:00
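
The brute-force route suggested in the comment above could look roughly like the following. This is a minimal sketch in plain Python, not something posted in the thread: the function name `best_group` and the variable names are made up for illustration, and the data is the six vectors from the question.

```python
from itertools import combinations

# The six vectors from the question (vector 1 is index 0, and so on).
vectors = [
    [0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
]


def best_group(vectors, k=3):
    """Return (indices, common) for the k vectors that share the most 1s.

    `best_group` is a hypothetical helper name; ties keep the first group found.
    """
    width = len(vectors[0])
    best_idx, best_common = None, None
    for idx in combinations(range(len(vectors)), k):
        # A position counts only if every selected vector has a 1 there.
        common = [int(all(vectors[i][j] for i in idx)) for j in range(width)]
        if best_common is None or sum(common) > sum(best_common):
            best_idx, best_common = idx, common
    return best_idx, best_common


idx, common = best_group(vectors)
print("best group:", [i + 1 for i in idx])  # 1-based, as in the question
print("shared 1s:", sum(common))
print("pattern:", common)
```

With at most 8 vectors per group this checks at most 56 triples, so running it once for each of the 33 groups should be cheap.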

Erwin Kalvelagen · Answer 1 · 2023-05-03T12:40:30.120

Let:

    i : rows
    j : columns
    d[i,j] : data matrix
    M : number of columns (21 in your example)
    k : number of rows that should match (k=3)

Define the following variables:

    binary variable pattern[j]
    integer variable diff[i]    (differences wrt pattern in row i)
    binary variable ok[i]       (ok[i]=1 => diff[i]=0)

Model:

    max sum(j, pattern[j])
    subject to
        sum(j, d[i,j]*pattern[j]) + diff[i] = sum(j, pattern[j])   forall i
        diff[i] <= (1-ok[i])*M                                     forall i
        sum(i, ok[i]) >= k

This is not an LP, but rather a MIP (mixed-integer program) due to the presence of the discrete variables.
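
To see how the constraints interact, here is a small plain-Python check that evaluates the model at a candidate solution. It does not solve the MIP. The helper name `check` is made up for illustration, and `d`, `M` and `k` follow the definitions above, using the data from the question.

```python
# Data matrix d[i][j] from the question (row 1 is index 0, and so on).
d = [
    [0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
]
M = len(d[0])  # number of columns, used as the big-M constant
k = 3          # number of rows that should match


def check(pattern, ok):
    """Evaluate objective and constraints for a candidate (pattern, ok).

    `check` is a hypothetical helper, not part of any solver API.
    """
    objective = sum(pattern)
    # diff[i] is fixed by the equality constraint:
    #   sum(j, d[i,j]*pattern[j]) + diff[i] = sum(j, pattern[j])
    # so it counts the 1s of the pattern that row i is missing.
    diff = [objective - sum(dij * pj for dij, pj in zip(row, pattern))
            for row in d]
    # Big-M constraint: ok[i]=1 forces diff[i]=0, ok[i]=0 leaves it free.
    big_m_ok = all(diff[i] <= (1 - ok[i]) * M for i in range(len(d)))
    feasible = big_m_ok and sum(ok) >= k
    return objective, diff, feasible


# Candidate 1: the pattern from the question, with rows 4, 5 and 6 marked ok.
pattern = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
ok = [0, 0, 0, 1, 1, 1]
print(check(pattern, ok))

# Candidate 2: also switch on column 8, which only row 4 of the three has.
bigger = pattern.copy()
bigger[7] = 1
print(check(bigger, ok))
```

The first candidate is feasible with objective 10. The second has a larger objective, but rows 5 and 6 each get `diff[i] = 1` while `ok[i] = 1`, so the big-M constraint rejects it.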
