
I have a scanned document image like this: [image: document sample]

The image was scanned with an ordinary scanner, so the document may be skewed. It has already been binarized, but a little noise remains. I want to find where this template is located in the picture. This is the template:

[image: the template I'm looking for]

My expected result is the location coordinates of the template inside the document image, in array form like this:

[[35,1532], [1923,20], [1923,1532]]

I also need a way to check whether the results are correct, such as drawing boxes around the matched templates. I've tried this code:

import cv2
import numpy as np

img = cv2.imread('image_document.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
template = cv2.imread('template.jpg',0)

# run template matching, get minimum val
res = cv2.matchTemplate(gray, template, cv2.TM_SQDIFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

# create threshold from min val, find where sqdiff is less than thresh
min_thresh = (min_val) * 1.5
match_locations = np.argwhere(res<=min_thresh)

# draw template match boxes
w, h = template.shape[::-1]
for (x, y) in zip(match_locations[1], match_locations[0]):
    cv2.rectangle(img, (x, y), (x+w, y+h), [0,255,255], 2)

# display result
cv2.imwrite('result.jpg', img)

But in the actual result the rectangle is too big and does not match the template.

ircham
  • I believe you want np.argwhere() not np.where. The latter returns the pixel values. The former returns the coordinates. – fmw42 Apr 10 '20 at 03:42
  • @fmw42 right, I've changed it and I got a rectangle in result.jpg. But the rectangle was too big and doesn't match with the template – ircham Apr 10 '20 at 03:48
  • Numpy returns coordinates as y,x. So you want to swap them for using them in OpenCV, which wants coordinates as x,y. Also shape returns values as h,w and not w,h. So be careful how you use Numpy and OpenCV – fmw42 Apr 10 '20 at 05:22
  • @fmw42 okay then. but the box is still in the top left corner and the size does not change when I swap x,y, and w,h – ircham Apr 10 '20 at 07:23
  • Your template image is much larger than the size of the corresponding markings in the input image. Template matching as you have it coded needs to have the template icon the same size as in the input image. Or you will need to use multi-scale template matching. Search Google. See for example https://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/. – fmw42 Apr 10 '20 at 18:56
  • @fmw42 thanks, is there any suggestion for a skewed input image? – ircham Apr 11 '20 at 09:31
  • Your pattern looks like a QR code locator. If you're using this pattern and not some arbitrary one, you can use a QR code alignment algorithm, see here for example: https://github.com/MikhailGordeev/QR-Code-Extractor – user2999345 Apr 11 '20 at 13:38
  • @user2999345 is that yours? I have trouble understanding the terms used – ircham Apr 11 '20 at 14:14
  • Basic template matching is sensitive to scale, rotation, skew. It searches only for offsets. You can do the same as multi-resolution template matching, but rotating or skewing your template. Affine template matching, which works for all of those distortions, does exist, but it is not implemented in OpenCV. – fmw42 Apr 11 '20 at 16:57
  • @fmw42 do you have the reference? – ircham Apr 11 '20 at 17:02
  • Reference to what? – fmw42 Apr 11 '20 at 17:08
  • @fmw42 ah, I misunderstood. You suggest an affine transformation? Yeah, I will do that after I get the three coordinates as parameters for the affine transform – ircham Apr 11 '20 at 17:17
  • See http://www.kky.zcu.cz/en/publications/1/SudhakarSah_2012_GPUAcceleratedReal.pdf and http://www.kky.zcu.cz/en/publications/2/SudhakarSah_2012_GPUAcceleratedReal.pdf – fmw42 Apr 11 '20 at 17:20
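
To illustrate the coordinate-order point from the comments: np.argwhere returns one (row, col) pair per hit, that is (y, x), so indexing match_locations[1] and match_locations[0] picks the second and first hits rather than the x and y columns. Below is a minimal sketch that uses a synthetic score map in place of a real cv2.matchTemplate result. hits_to_boxes is a hypothetical helper name, and the threshold and shapes are made-up values.

```python
import numpy as np

def hits_to_boxes(scores, thresh, template_shape):
    """Turn a score map into (x, y, w, h) boxes for every score <= thresh.

    `scores` stands in for the array returned by cv2.matchTemplate with a
    SQDIFF method (lower is better); `template_shape` is template.shape.
    """
    h, w = template_shape[:2]                    # NumPy shape is (rows, cols) = (h, w)
    boxes = []
    for y, x in np.argwhere(scores <= thresh):   # each row is (row, col) = (y, x)
        boxes.append((int(x), int(y), w, h))     # OpenCV drawing calls want (x, y)
    return boxes

# Tiny synthetic score map with a single good match at row 2, column 5.
scores = np.ones((4, 8), dtype=np.float32)
scores[2, 5] = 0.01
print(hits_to_boxes(scores, 0.05, (10, 20)))     # [(5, 2, 20, 10)]
```

Each box can then be drawn with cv2.rectangle(img, (x, y), (x + w, y + h), ...). This only fixes the coordinate order; it does not address the template being a different size from the marks in the scan.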

1 Answer


I've had a similar problem; this is how I solved it.

import numpy as np
import cv2

# Read the image to search in, plus a grayscale copy for matching
img = cv2.imread('image_document.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Read the template as grayscale (flag 0)
template = cv2.imread('template.jpg', 0)

# Score every position. Image and template must have the same number of
# channels, so match against the grayscale copy, not the colour image.
# I'm using TM_CCOEFF_NORMED because it worked best for my problem,
# but you could test the other methods.
result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)

# This gives you the single best match, not all matches.
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# You have to find the right threshold for your problem.
# Mine was 0.95, which demands a very close match; you might need 0.6, 0.4 or another value.
# A threshold of 1 would require a perfect match.
threshold = 0.95

# Rows (y) and columns (x) of every position scoring at or above the threshold.
# If you get too many locations, your threshold is too low.
yloc, xloc = np.where(result >= threshold)
# shape is (h, w) for a grayscale image, so reverse it to get (w, h)
w, h = template.shape[::-1]

# Collect the matched rectangles as [x, y, w, h].
# For your problem we would hope for 3 of them.
rectangles = []
# draw template match boxes (OpenCV wants x first, then y)
for (x, y) in zip(xloc, yloc):
    x, y = int(x), int(y)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)
    rectangles.append([x, y, w, h])

# save the annotated image and print the boxes
cv2.imwrite('result.jpg', img)
print(rectangles)
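
With a threshold, a single visual match usually shows up as a cluster of neighbouring hits, so the printed list can contain far more entries than there are marks on the page. One common way to collapse them is greedy non-maximum suppression. The sketch below is an assumption about how one might do it, not part of the code above: distinct_matches is a hypothetical helper, and the synthetic result array stands in for the output of cv2.matchTemplate.

```python
import numpy as np

def distinct_matches(result, threshold, w, h):
    """Greedy non-maximum suppression over a template-matching score map.

    Keeps the best-scoring location, drops every hit closer than one template
    size to a kept one, and repeats. Assumes higher scores are better
    (e.g. TM_CCOEFF_NORMED). Returns a list of [x, y, w, h].
    """
    ys, xs = np.where(result >= threshold)
    order = np.argsort(result[ys, xs])[::-1]      # indices of hits, best score first
    kept = []
    for i in order:
        x, y = int(xs[i]), int(ys[i])
        # Keep the hit only if its box overlaps none of the boxes kept so far.
        if all(abs(x - kx) >= w or abs(y - ky) >= h for kx, ky, _, _ in kept):
            kept.append([x, y, w, h])
    return kept

# Synthetic score map: one cluster of three hits plus one isolated hit.
result = np.zeros((50, 50), dtype=np.float32)
result[10, 10], result[10, 11], result[11, 10] = 0.99, 0.97, 0.96
result[30, 40] = 0.98
print(distinct_matches(result, 0.95, 8, 8))       # [[10, 10, 8, 8], [40, 30, 8, 8]]
```

The corner coordinates asked for in the question can then be read off each kept box as (x, y), (x + w, y), (x, y + h) and so on.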

Hope this helps; it solved my problem. One more thing: you should crop the template out of the original image itself, since that helps the match (the template is then at the same scale as what you are searching for). As for the threshold, I don't know if deriving it from min_val is the best choice for your problem; you could try to find the value that works best for you.