7

I put "submatrix" in quotation marks because what I mean is, for example:

B = [[1,2,3,4,5],
     [6,7,8,9,10],
     [11,12,13,14,15],
     [16,17,18,19,20]]

Suppose we select rows 2 and 4 and columns 1 and 3 (counting from 1); the intersections give us

A = [[6,8],
     [16,18]]

My question is: given A and B, is there a way to find out which rows and columns of B were selected to give A?

Ideally the answer would be in Python/NumPy, but an answer in plain math or in another programming language is fine as well.

Thibaut
CJJ
  • Is `[[6,8,9],[16,18,19]]` a valid sub matrix? (In other words, are you potentially looking for a list of indices on each axis, or only a start/stop/step slice?) – abarnert Nov 12 '14 at 21:19
  • Also, should we assume you can figure out the obvious brute force solution, so it's not worth mentioning (at least without an argument that nothing more efficient is possible)? – abarnert Nov 12 '14 at 21:20
  • Do you assume that rows/columns can be repeated in the matrix `A`? Do you assume that the rows and columns appear in the same order in `B` and in `A`? – Thibaut Nov 12 '14 at 22:46
  • You really don't need the quotes, as what you describe here is actually called a [submatrix](http://en.wikipedia.org/wiki/Matrix_(mathematics)#Submatrix) – chthonicdaemon Nov 13 '14 at 16:06

4 Answers

10

This is a very hard combinatorial problem. In fact, the Subgraph Isomorphism Problem can be reduced to your problem (when the matrix A only has 0-1 entries, your problem is exactly a subgraph isomorphism problem). Subgraph isomorphism is known to be NP-complete.

Here is a recursive backtracking solution which does a bit better than brute-forcing all possible solutions. Note that this still takes exponential time in the worst case. However, if you assume that a solution exists and that there are no ambiguities (for example that all the entries in B are distinct), this finds the solution in linear time.

def locate_columns(a, b, offset=0):
    """Locate `a` as a sublist of `b`.

    Yields all possible lists of `len(a)` indices such that `a` can be read
    from `b` at those indices.
    """
    if not a:
        yield []
    else:
        positions = (offset + i for (i, e) in enumerate(b[offset:])
                     if e == a[0])
        for position in positions:
            for tail_cols in locate_columns(a[1:], b, position + 1):
                yield [position] + tail_cols


def locate_submatrix(a, b, offset=0, cols=None):
    """Locate `a` as a submatrix of `b`.

    Yields all possible pairs of (row_indices, column_indices) such that
    `a` is the projection of `b` on those indices.
    """
    if not a:
        yield [], cols
    else:
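        for j, row in enumerate(b[offset:]):
            if cols is not None:
                # Columns were fixed by an earlier row of `a`: just check this row.
                if all(e == f for e, f in zip(a[0], [row[c] for c in cols])):
                    for r, c in locate_submatrix(a[1:], b, offset + j + 1, cols):
                        yield [offset + j] + r, c
            else:
                # First row of `a`: try every column choice found in this row.
                # A separate name keeps `cols` unset for the remaining candidate rows.
                for first_cols in locate_columns(a[0], row):
                    for r, c in locate_submatrix(a[1:], b, offset + j + 1, first_cols):
                        yield [offset + j] + r, c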

B = [[1,2,3,4,5], [6,7,8,9,10], [11,12,13,14,15], [16,17,18,19,20]]
A = [[6,8], [16,18]]

for loc in locate_submatrix(A, B):
    print(loc)

This will output:

([1, 3], [0, 2])
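Each yielded pair can be sanity-checked by reading `B` back at those indices. A minimal sketch, where `project` is a hypothetical helper name used only for illustration:

```python
def project(b, row_indices, col_indices):
    """Return the submatrix of `b` at the given row and column indices."""
    return [[b[r][c] for c in col_indices] for r in row_indices]


B = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
A = [[6, 8], [16, 18]]

# The location printed above: rows [1, 3] and columns [0, 2] (0-based).
print(project(B, [1, 3], [0, 2]) == A)  # True
```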
Thibaut
  • But it doesn't work when B = [[1,6,3,8,5], [6,7,8,9,10], [11,12,13,14,15], [16,17,18,19,20]] – CJJ Nov 14 '14 at 03:02
  • Do you keep `A` to still be `[[6,8], [16,18]]`? In this case `A` is not a submatrix of `B` and the program does not output anything. Which is the expected output. – Thibaut Nov 14 '14 at 03:04
  • @Thibaut It actually forms a submatrix if you take the second and fourth rows and columns of `B`. – ABu Sep 08 '20 at 16:00
1

If all you want to know is which rows and columns of B were selected to give A, and you do not care about efficiency, here is a brute-force way that stores the results in a list `res`: `res[N]` holds all locations of `A[N]` in B. It works even if `A[N]` occurs in multiple rows of B.

B = [[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]]
A = [[6,8], [16,18]]
res = []

for subsetIndex, subset in enumerate(A):
    k = []  # all locations of this row of A in B
    res.append(k)
    for supersetIndex, superset in enumerate(B):
        try:
            # list.index raises ValueError if an item is missing from this row of B
            loc = [(supersetIndex, superset.index(item)) for item in subset]
            k.append(loc)
            print(A[subsetIndex], "is at ", loc, "in B")
        except ValueError:
            pass
print("result = ", res)

output:

[6, 8] is at  [(1, 0), (1, 2)] in B
[16, 18] is at  [(3, 0), (3, 2)] in B
result =  [[[(1, 0), (1, 2)]], [[(3, 0), (3, 2)]]]
user3885927
1

Are all or most of the values in the matrix unique, i.e. do they appear only once in matrix B?

The more unique the values, the more you can improve over subgraph isomorphism (SI). If all values are unique, you can do a reverse look-up on each value of A to determine its row/column pair in B, then take the union of the rows and of the columns (separately).

The result is a simple O(N) algorithm, where N = number of rows * number of columns. Of course, the less unique the values, the more false positives you get that need checking, the closer you get to SI, and the less simple things get.
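A minimal sketch of the reverse look-up, assuming every value in B is distinct and A is non-empty. The function name `locate_unique` is made up for this example, and instead of a literal union it reads the rows from A's first column and the columns from A's first row, which keeps them in A's order:

```python
def locate_unique(a, b):
    """Find (rows, cols) of `b` that give `a`, assuming all entries of `b` are distinct.

    Returns None if `a` is not a submatrix of `b`.
    """
    # Reverse look-up: value -> (row, column) in b. One pass over b.
    position = {}
    for i, row in enumerate(b):
        for j, value in enumerate(row):
            position[value] = (i, j)

    try:
        # With distinct values, the first column of `a` fixes the rows
        # and the first row of `a` fixes the columns.
        rows = [position[a_row[0]][0] for a_row in a]
        cols = [position[value][1] for value in a[0]]
    except KeyError:
        return None  # some value of `a` does not occur in `b`

    # Check every entry, in case the values of `a` do not line up on a grid.
    if all(b[r][c] == a[i][j]
           for i, r in enumerate(rows) for j, c in enumerate(cols)):
        return rows, cols
    return None


B = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
A = [[6, 8], [16, 18]]
print(locate_unique(A, B))  # ([1, 3], [0, 2])
```

If values repeat in B, the dictionary keeps only the last position of each value, so this sketch no longer applies and the candidates would need the kind of checking described above.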

Nuclearman
0

Here's a brute-force solution, if that's all you need:

import numpy as np  # A and B are assumed to be 2-D NumPy arrays

rows = [i for aa in A for i,bb in enumerate(B) if np.isin(aa, bb).all()]
cols = [i for aa in A.T for i,bb in enumerate(B.T) if np.isin(aa, bb).all()]

submatrix = B[np.ix_(rows, cols)]

It checks every row of A against every row of B, keeping a row of B when it contains all the elements of that row of A. Then it does the same thing for the columns.

You could speed up the column-finding part by restricting it to only the relevant rows:

cols = [i for aa in A.T for i,bb in enumerate(B[rows].T) if np.equal(aa, bb).all()]
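For reference, here is a self-contained run of this approach on the example from the question. It is only a sketch: it uses `np.isin` for the membership test and `np.array_equal` for the column comparison, and it assumes each row of A matches exactly one row of B.

```python
import numpy as np

B = np.arange(1, 21).reshape(4, 5)   # the 4x5 example matrix from the question
A = np.array([[6, 8], [16, 18]])

# Row search: a row of B is kept if it contains every element of a row of A.
rows = [i for aa in A for i, bb in enumerate(B) if np.isin(aa, bb).all()]
# Column search restricted to the rows found above; requires exact equality.
cols = [i for aa in A.T for i, bb in enumerate(B[rows].T) if np.array_equal(aa, bb)]

print(rows, cols)                                # [1, 3] [0, 2]
print(np.array_equal(B[np.ix_(rows, cols)], A))  # True
```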
perimosocordiae