18

I was thinking of a fast method to look for a submatrix m in a bigger matrix M. I also need to identify partial matches.

A couple of approaches I could think of are:

  1. Optimize the normal brute-force search to process only incremental rows and columns.
  2. Maybe extend the Rabin-Karp algorithm to 2-D, but I'm not sure how to handle partial matches with it.

I believe this is a frequently encountered problem in image processing, and I would appreciate it if someone could share their input or point me to resources/papers on this topic.

EDIT: Smaller example:

Bigger matrix:
1 2 3 4 5
4 5 6 7 8
9 7 6 5 2

Smaller Matrix:
7 8
5 2

Result: (row: 1 col: 3)

An example of a smaller matrix that qualifies as a partial match at (1, 3):
7 9
5 2

If more than half of the pixels match, it is taken as a partial match.
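To make the rule concrete, here is a minimal brute-force sketch of it in Python. The function name `find_partial_matches` and the `threshold` parameter are illustrative assumptions, not part of any library; the sketch simply counts equal cells at every position and keeps positions where more than half agree.

```python
def find_partial_matches(big, small, threshold=0.5):
    """Return (row, col, matched_cells) for every position where the
    fraction of equal cells is strictly greater than `threshold`."""
    rows, cols = len(big), len(big[0])
    h, w = len(small), len(small[0])
    results = []
    for i in range(rows - h + 1):
        for j in range(cols - w + 1):
            # Count the cells of `small` that equal the cell under them.
            matched = sum(
                big[i + r][j + s] == small[r][s]
                for r in range(h)
                for s in range(w)
            )
            if matched > threshold * h * w:
                results.append((i, j, matched))
    return results


bigger = [
    [1, 2, 3, 4, 5],
    [4, 5, 6, 7, 8],
    [9, 7, 6, 5, 2],
]
print(find_partial_matches(bigger, [[7, 8], [5, 2]]))  # exact match: all 4 cells agree
print(find_partial_matches(bigger, [[7, 9], [5, 2]]))  # partial match: 3 of 4 cells agree
```

This costs one comparison per cell per position, so it is only a baseline to check faster methods against.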

Thanks.

knowledgeSeeker
  • 5
    What are the restrictions the smaller matrix needs to obey? If none exist - just take the first m x n submatrix,... – amit May 10 '12 at 08:04
  • Could you add a condition for your submatrices, please? – gaussblurinc May 10 '12 at 08:22
  • added a condition for submatrices. – knowledgeSeeker May 10 '12 at 10:01
  • 2
    That's not a restriction. Just take the upper left square then. – harold May 10 '12 at 10:03
  • Harold, looking for 'a specific number' in a list doesn't mean I am looking for 'any number' in the list. I have added an example. – knowledgeSeeker May 10 '12 at 10:07
  • What they mean is, how do you tell whether a matrix is a match in a certain position? "Partial match" means absolutely nothing. – svinja May 10 '12 at 10:08
  • I'm not sure why everyone finds this question so confusing. If I said, "how do I find a substring in a larger string", nobody would be confused - the implicit assumption is that the substring has some value that can be passed as a parameter. The OP is asking "how do I find an arbitrary sub-matrix (specified as a parameter) within a larger matrix"? – Charles Salvia May 10 '12 at 10:20
  • @CharlesSalvia Yes, but "I also need to identify partial matches." needs elaboration. That part is unclear. – Daniel Fischer May 10 '12 at 10:30
  • Possible duplicate: how-to-extract-a-2x2-submatrix-from-a-bigger-matrix (http://stackoverflow.com/questions/2797767/how-to-extract-a-2x2-submatrix-from-a-bigger-matrix) – acraig5075 May 10 '12 at 10:33
  • 1
    I have added some examples. The question should be pretty clear now. – knowledgeSeeker May 10 '12 at 10:37
  • Not a duplicate of that. But will be hard to find an efficient solution, I don't think you will be able to adapt any of the string search "tricks" as you can't make any decisions based on a single element not matching (due to having partial matches which can have up to half non-matching elements). – svinja May 10 '12 at 11:35
  • The more context, the more suitable a solution you can expect to get. Is m x n << M x N? Is m x n >> 1 x 1? Do you typically expect to match many different submatrices to one matrix, one submatrix against many matrices or one submatrix on one matrix? One little assumption can turn an NP-hard problem into a log(N) one. – smocking May 11 '12 at 13:07
  • @smocking I have to match one submatrix on one matrix. For simplicity, assume MxN to be a square matrix, 1024x1024, and the smaller one to be 4x4. – knowledgeSeeker May 14 '12 at 09:36

4 Answers

2

I recommend doing an internet search on "2d pattern matching algorithms". You'll get plenty of results. I'll just link the first hit on Google, a paper that presents an algorithm for your problem.

You can also take a look at the citations at the end of the paper to get an idea of other existing algorithms.

The abstract:

An algorithm for searching for a two dimensional m x m pattern in a two dimensional n x n text is presented. It performs on the average less comparisons than the size of the text: n^2/m using m^2 extra space. Basically, it uses multiple string matching on only n/m rows of the text. It runs in at most 2n^2 time and is close to the optimal n^2 time for many patterns. It steadily extends to an alphabet-independent algorithm with a similar worst case. Experimental results are included for a practical version.
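The row-sampling idea in the abstract can be illustrated with a simplified sketch. This is not the paper's algorithm: it replaces the multiple-string-matching step with a plain dictionary lookup of row segments, and the function name `find_by_row_sampling` is an assumption made for illustration. It relies only on the observation that any occurrence of a pattern with m rows must cover exactly one text row out of every m.

```python
def find_by_row_sampling(text, pattern):
    """Return sorted top-left corners of exact occurrences of `pattern`."""
    n_rows, n_cols = len(text), len(text[0])
    m_rows, m_cols = len(pattern), len(pattern[0])

    # Map each distinct pattern row to the pattern row indices where it occurs.
    rows_of = {}
    for r, row in enumerate(pattern):
        rows_of.setdefault(tuple(row), []).append(r)

    hits = []
    # Only rows m_rows-1, 2*m_rows-1, ... of the text are scanned.
    for i in range(m_rows - 1, n_rows, m_rows):
        for j in range(n_cols - m_cols + 1):
            segment = tuple(text[i][j:j + m_cols])
            for r in rows_of.get(segment, ()):
                top = i - r  # candidate top row if this segment is pattern row r
                if top + m_rows > n_rows:
                    continue
                # Verify the whole candidate window.
                if all(
                    tuple(text[top + k][j:j + m_cols]) == tuple(pattern[k])
                    for k in range(m_rows)
                ):
                    hits.append((top, j))
    return sorted(hits)


text = [
    [1, 2, 3, 4, 5],
    [4, 5, 6, 7, 8],
    [9, 7, 6, 5, 2],
]
print(find_by_row_sampling(text, [[7, 8], [5, 2]]))
```

Because a partial match may disagree on the sampled row itself, this filter applies to exact matching only.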

MicSim
1

There are very fast algorithms for this if you are willing to preprocess the matrix and if you have many queries for the same matrix.

See the papers on Algebraic Databases by the Research group on Multimedia Databases (Prof. Clausen, University of Bonn), for example this one: http://www-mmdb.iai.uni-bonn.de/download/publications/sigir-03.pdf

The basic idea is to generalize inverted lists so that they use any kind of algebraic transformation, instead of just shifts in one direction as with ordinary inverted lists.

This means that the approach works whenever the modifications you need to apply to the input data can be modelled algebraically. Specifically, queries that are translated in any number of dimensions, rotated, flipped, etc. can all be retrieved.

The paper mainly shows this for musical data, since that is their main research interest, but you might be able to find others that show how to adapt it to image data as well (or you can try to adapt it yourself; once you understand the principle, it's quite simple).

Edit:

This idea also works with partial matches, if you define them correctly.
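As a rough illustration for the translation-only case (not the algorithm from the paper), one could index the big matrix by value and let every cell of the query vote for the top-left offset it would imply. The names `build_inverted_index` and `vote_for_offsets` are hypothetical, and defining a partial match as "more than half the cells agree" is taken from the question.

```python
from collections import Counter, defaultdict


def build_inverted_index(big):
    """Preprocess once: map each value to the list of positions holding it."""
    index = defaultdict(list)
    for i, row in enumerate(big):
        for j, value in enumerate(row):
            index[value].append((i, j))
    return index


def vote_for_offsets(index, small, big_shape):
    """Count, for each valid top-left offset, how many query cells agree."""
    rows, cols = big_shape
    h, w = len(small), len(small[0])
    votes = Counter()
    for r, row in enumerate(small):
        for s, value in enumerate(row):
            for i, j in index.get(value, ()):
                top, left = i - r, j - s
                # Keep only offsets where the whole query fits inside `big`.
                if 0 <= top <= rows - h and 0 <= left <= cols - w:
                    votes[(top, left)] += 1
    return votes


big = [
    [1, 2, 3, 4, 5],
    [4, 5, 6, 7, 8],
    [9, 7, 6, 5, 2],
]
index = build_inverted_index(big)
query = [[7, 9], [5, 2]]
votes = vote_for_offsets(index, query, (3, 5))
partial = [pos for pos, count in votes.items() if 2 * count > 4]
print(partial)
```

The index is built once per big matrix; each query then touches only the positions that share a value with it, which pays off when values are diverse and there are many queries.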

LiKao
0

There is no way to do this fast if you only ever need to match one small matrix against one big matrix. But if you need to match many small matrices against the same big matrix, then preprocess the big matrix.

A simple example, exact match, many 3x3 matrices against one giant matrix.

Make a new "match matrix", the same size as the "big matrix". For each location (x, y) in the big matrix, compute a hash of the 3x3 window whose top-left corner is (x, y), and store it at (x, y) in the match matrix. Now you just scan the match matrix for matching hashes.

You can achieve partial matches with specialized hash functions that give the same hash to things that have the same partial matching properties. Tricky.

If you want to speed up further and have memory for it, create a hash table for the match matrix, and lookup the hashes in the hash table.

The 3x3 solution will work for any test matrix 3x3 or larger. You don't need to have a perfect hash method - you need just something that will reject the majority of bad matches, and then do a full match for potential matches in the hash table.
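A minimal sketch of this preprocessing, under some assumptions: the window size is a parameter `k` rather than a fixed 3, Python's built-in `hash` of a tuple stands in for the hash function, and the names `build_window_table` and `lookup` are made up for illustration. The hash only filters candidates; every candidate is verified in full.

```python
from collections import defaultdict


def window_key(matrix, i, j, k):
    """Hash of the k x k window of `matrix` with top-left corner (i, j)."""
    return hash(tuple(tuple(matrix[i + r][j:j + k]) for r in range(k)))


def build_window_table(big, k):
    """Preprocess once: map each window hash to the positions producing it."""
    table = defaultdict(list)
    for i in range(len(big) - k + 1):
        for j in range(len(big[0]) - k + 1):
            table[window_key(big, i, j, k)].append((i, j))
    return table


def lookup(table, big, small, k):
    """Exact matches of `small` (k x k or larger) using the k x k table."""
    h, w = len(small), len(small[0])
    found = []
    # Hash only the top-left k x k corner of the query.
    for i, j in table.get(window_key(small, 0, 0, k), ()):
        if i + h > len(big) or j + w > len(big[0]):
            continue  # the full query would not fit here
        # Full verification rejects hash collisions and corner-only matches.
        if all(list(big[i + r][j:j + w]) == list(small[r]) for r in range(h)):
            found.append((i, j))
    return found


big = [
    [1, 2, 3, 4, 5],
    [4, 5, 6, 7, 8],
    [9, 7, 6, 5, 2],
]
table = build_window_table(big, 2)
print(lookup(table, big, [[7, 8], [5, 2]], 2))
print(lookup(table, big, [[6, 7, 8], [6, 5, 2]], 2))
```

Building the table costs one pass over the big matrix; after that each exact query is a dictionary lookup plus verification of the few candidates it returns.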

Rafael Baptista
0

I think you cannot just guess where the submatrix is with some approach, but you can optimize your searching.

For example, given an MxN matrix A and an mxn submatrix B, you can do something like:

SearchSubMatrix (Matrix A, Matrix B)

answer = (-1, -1)

Loop1:
for i = 0 ... (M-m)
|
|   for j = 0 ... (N-n)
|   |
|   |   bool found = false
|   |
|   |   if A[i][j] = B[0][0] then
|   |   |
|   |   |   found = true
|   |   |
|   |   |   Loop2:
|   |   |   for r = 0 ... (m-1)
|   |   |   |   for s = 0 ... (n-1)
|   |   |   |   |   if B[r][s] != A[r+i][s+j] then
|   |   |   |   |   |   found = false
|   |   |   |   |   |   break Loop2
|   |
|   |   if found then
|   |   |   answer = (i, j)
|   |   |   break Loop1
|
return answer

Doing this, you reduce the number of positions examined in proportion to the size of the submatrix.

Matrix         Submatrix         Worst Case:
1 2 3 4           2 4            [1][2][3] 4
4 3 2 1           3 2            [4][3][2] 1
1 3 2 4                          [1][3]{2  4}
4 1 3 2                           4  1 {3  2}

                                 (M-m+1)(N-n+1) = (4-2+1)(4-2+1) = 9

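Although the number of candidate positions is O(M*N), the search never examines M*N positions unless the submatrix is 1x1; it examines at most (M-m+1)(N-n+1) of them. Each candidate that passes the first-element check can still cost up to m*n comparisons.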
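For reference, the search above can be written as a short runnable Python sketch. The function name `search_submatrix` is an illustrative choice; the behaviour follows the pseudocode, returning the first exact match or (-1, -1).

```python
def search_submatrix(a, b):
    """Return the (row, col) of the first exact occurrence of b in a,
    or (-1, -1) if there is none."""
    big_rows, big_cols = len(a), len(a[0])
    sub_rows, sub_cols = len(b), len(b[0])
    for i in range(big_rows - sub_rows + 1):
        for j in range(big_cols - sub_cols + 1):
            # Cheap first-element check before comparing the whole window.
            if a[i][j] != b[0][0]:
                continue
            # all() stops at the first mismatching cell.
            if all(
                a[i + r][j + s] == b[r][s]
                for r in range(sub_rows)
                for s in range(sub_cols)
            ):
                return (i, j)
    return (-1, -1)


matrix = [
    [1, 2, 3, 4],
    [4, 3, 2, 1],
    [1, 3, 2, 4],
    [4, 1, 3, 2],
]
print(search_submatrix(matrix, [[2, 4], [3, 2]]))
```

On the worst-case example above, the match sits at the last of the 9 candidate positions, so every position is visited before it is found.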

Peter O.
Daniel