
I'm trying to follow this tutorial with my own images. However, the results I get are not exactly what I'd expect. Am I missing something here, or is SIFT just not a good enough solution in this case? Thanks a lot.

import numpy as np
import cv2
from matplotlib import pyplot as plt

MIN_MATCH_COUNT = 10

img1 = cv2.imread('Q/IMG_1192.JPG', 0)          # queryImage
img2 = cv2.imread('DB/IMG_1208-1000.jpg', 0) # trainImage

# Initiate SIFT detector (in OpenCV >= 4.4 this is cv2.SIFT_create())
sift = cv2.xfeatures2d.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)



FLANN_INDEX_KDTREE = 1  # in FLANN, the KD-tree index is 1 (0 selects the linear index)
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)

flann = cv2.FlannBasedMatcher(index_params, search_params)

matches = flann.knnMatch(des1, des2, k=2)

# store all the good matches as per Lowe's ratio test.
good = []
for m,n in matches:
    if m.distance < 0.7*n.distance:
        good.append(m)


if len(good)>MIN_MATCH_COUNT:
    src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
    dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    matchesMask = mask.ravel().tolist()

    h, w = img1.shape
    pts = np.float32([ [0,0],[0,h-1],[w-1,h-1],[w-1,0] ]).reshape(-1,1,2)
    dst = cv2.perspectiveTransform(pts,M)

    img2 = cv2.polylines(img2,[np.int32(dst)],True,255,3, cv2.LINE_AA)

else:
    print("Not enough matches found - %d/%d" % (len(good), MIN_MATCH_COUNT))
    matchesMask = None

draw_params = dict(matchColor = (0,255,0), # draw matches in green color
                   singlePointColor = None,
                   matchesMask = matchesMask, # draw only inliers
                   flags = 2)

img3 = cv2.drawMatches(img1,kp1,img2,kp2,good,None,**draw_params)

plt.imshow(img3, 'gray')
plt.show()

Original images:

And the result:

Alex

2 Answers


From the images provided I figured out that SIFT won't find key features in images that are extremely big. Consider the individual car image: it is 1728 pixels wide and 2304 pixels high, which is too big. The other image has a fairly normal size, with the car occupying a smaller region.

The features one would expect to be matched are the rims of the wheels, the corners of the windows, the corners around the bonnet, and so on. But in an enlarged image such as the one provided there are no distinct corners; instead there are mostly plain edges. SIFT looks for feature points that are distinct in nature (corners in particular).

After resizing the car image to 605 x 806 and the other image to 262 x 350, one correct match was found, shown in the following figure (notice the match near the wheel):

[image: SIFT matches after resizing, one correct match near the wheel]
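
The resizing step itself can be done along these lines (a minimal sketch of what I described above; the fixed target width is an assumption, pick whatever keeps the object a few hundred pixels wide):

import cv2

# Hypothetical helper: shrink an image so its width does not exceed
# max_width, preserving the aspect ratio. INTER_AREA is the usual
# interpolation choice when downscaling.
def shrink(img, max_width=800):
    scale = max_width / float(img.shape[1])
    if scale >= 1.0:
        return img  # already small enough
    return cv2.resize(img, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_AREA)

img1 = shrink(cv2.imread('Q/IMG_1192.JPG', 0))        # queryImage
img2 = shrink(cv2.imread('DB/IMG_1208-1000.jpg', 0))  # trainImage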

I tried the same code on another pair of images containing letters and drawings. Here is the result:

[image: SIFT matches on images with letters and drawings]

Jeru Luke
  • Hi, thanks for your answer, do you think SIFT is simply not the right way to go here? I've tried scaling down the images but the number of correct matches together with 'false positives' is not really enough for my purposes. – Alex Aug 06 '18 at 10:19
  • SIFT is by far the best feature descriptor out there, as it is invariant to geometric and affine transformations. The problem, as I see it, is images that do not contain enough features such as corners and edges. The images I provided have many letters and shapes with such corners, hence SIFT was able to perform well. – Jeru Luke Aug 06 '18 at 10:57
  • More recent descriptors give much better results than SIFT. Many are already implemented in OpenCV's feature2d and xfeatures2d modules. – John_Sharp1318 Aug 06 '18 at 13:19
  • Trying other descriptors is not really an option since I have to use SIFT; however, I tried playing around with the SIFT_create() parameters, and some variations gave me better results with this specific image (after scaling it down). – Alex Aug 06 '18 at 14:16
  • @JeruLuke cv2.xfeatures2d.SIFT_create(sigma = 1.2, contrastThreshold = 0.01, edgeThreshold = 20) gave me some good recognition for the wheels of the toy car. – Alex Aug 08 '18 at 13:05

In order to evaluate whether the issue comes from the SIFT descriptor, I would advise you to try another descriptor such as cv2.xfeatures2d.VGG_create() or cv2.BRISK_create(). Also take a look at cv2.xfeatures2d.matchGMS; it may give much better results even with the SIFT descriptor. A rough sketch of such an alternative pipeline follows.
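
For instance, with BRISK plus GMS filtering (a minimal sketch, not the asker's exact setup: BRISK is a binary descriptor, so it is matched with Hamming distance rather than FLANN, and matchGMS requires the opencv-contrib package):

import cv2

img1 = cv2.imread('Q/IMG_1192.JPG', 0)        # queryImage
img2 = cv2.imread('DB/IMG_1208-1000.jpg', 0)  # trainImage

# BRISK instead of SIFT; detectAndCompute has the same interface
brisk = cv2.BRISK_create()
kp1, des1 = brisk.detectAndCompute(img1, None)
kp2, des2 = brisk.detectAndCompute(img2, None)

# Brute-force Hamming matching for binary descriptors
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = bf.match(des1, des2)

# GMS keeps matches consistent with smooth local motion;
# the sizes are (width, height), hence shape[::-1] on grayscale images
good = cv2.xfeatures2d.matchGMS(img1.shape[::-1], img2.shape[::-1],
                                kp1, kp2, matches, withRotation=True)
print("GMS kept %d of %d matches" % (len(good), len(matches)))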

From my personal experience, among the possible reasons for the lack of accuracy in your application of the SIFT algorithm is its sensitivity to gradient reversal. The SIFT descriptor contains a quantised histogram of the normalised gradient orientations surrounding a keypoint. The thing is, if the intensity in a region goes from a lighter to a darker pixel (e.g. 255 -> 127), the gradient orientation will be different than if the intensity goes from darker to lighter (e.g. 127 -> 255).
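
A quick way to see this effect (my own minimal illustration on synthetic data, not taken from the question): compute the gradient orientation on an intensity ramp and on its inverted copy; the two orientations differ by 180 degrees, so the orientation histograms SIFT builds from them differ as well.

import numpy as np
import cv2

# A horizontal ramp (dark -> light); its inversion goes light -> dark
ramp = np.tile(np.linspace(0, 255, 64, dtype=np.float32), (64, 1))

for name, img in (('ramp', ramp), ('inverted', 255.0 - ramp)):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)  # horizontal derivative
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)  # vertical derivative
    angle = np.degrees(np.arctan2(gy, gx))[32, 32]
    print('%s: gradient orientation = %.0f deg' % (name, angle))

# Prints 0 deg for the ramp and 180 deg for the inverted ramp.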

John_Sharp1318
  • So the problem might be the position of shadows depending on the angle of the photograph? – Alex Aug 06 '18 at 10:21
  • Among other causes, yes. If real-time processing is not mandatory for your application, I still suggest trying more recent image descriptors; they are more accurate most of the time. – John_Sharp1318 Aug 06 '18 at 13:16