
Based on the tutorial Feature Matching + Homography to find Objects, I noticed that SIFT struggles to adapt to transformations of an object such as changes in scale or partial occlusion. In the image of the second result we can even see that there are no inlier (relevant) matches.

Is there a way to solve this problem without using deep learning methods? The other algorithms (ORB, SURF, FAST) do not seem to give satisfactory results either.

Here is the code and images used for the example below.

import numpy as np
import cv2
from matplotlib import pyplot as plt

MIN_MATCH_COUNT = 10

img1 = cv2.imread('starbucks/Starbucks_Corporation_Logo_2011.png', 0)  # queryImage
img2 = cv2.imread('starbucks/870x489_maxnewsworldfive046642.jpg', 0)  # trainImage

# Initiate SIFT detector
sift = cv2.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

FLANN_INDEX_KDTREE = 1  # 1 selects FLANN's kd-tree index (0 is the linear index)
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)

flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1, des2, k=2)

# Lowe's ratio test
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

if len(good) > MIN_MATCH_COUNT:
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    matchesMask = mask.ravel().tolist()

    # project the query image's corners into the train image
    h, w = img1.shape
    pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
    dst = cv2.perspectiveTransform(pts, M)

    img2 = cv2.polylines(img2, [np.int32(dst)], True, 255, 3, cv2.LINE_AA)
else:
    print(f"Not enough matches are found - {len(good)}/{MIN_MATCH_COUNT}")
    matchesMask = None

draw_params = dict(matchColor=(0, 255, 0),  # draw matches in green
                   singlePointColor=None,
                   matchesMask=matchesMask,  # draw only inliers
                   flags=2)

img3 = cv2.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)
plt.imshow(img3, 'gray')
plt.show()
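One way to make the "no inlier match" situation explicit is to sanity-check the homography that RANSAC returns before trusting the projected box. This is a sketch of my own, not part of the tutorial; the `min_inliers` threshold and the determinant test are heuristics:

```python
import numpy as np

def homography_looks_valid(M, mask, min_inliers=10):
    """Heuristic check of a cv2.findHomography result: enough RANSAC
    inliers, and a non-degenerate linear part (a near-zero determinant
    of the upper-left 2x2 block means the projected quadrilateral
    collapses to a line or a point)."""
    if M is None:
        return False
    inliers = int(np.asarray(mask).ravel().sum())
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    return inliers >= min_inliers and abs(det) > 1e-3
```

When this returns False, the polygon produced by cv2.perspectiveTransform is not meaningful, which is consistent with what the second result shows.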

(result images: SIFT matches for the two test photos; the second shows no inlier matches)

  • The icon logo isn't perfect for SIFT; a textured object should work better. In addition, the impressive scale and rotation invariance of SIFT has limits. You can use a set/database of template images with different perspectives to handle that. – Micka Aug 03 '22 at 14:17
  • @Micka Thanks for your suggestions. Now suppose it is a single flat logo as a .png, for which there is no database? – Tim Aug 03 '22 at 14:58
  • @Tim Very interesting problem. Please update the page if you find a solution – Juancheeto Aug 03 '22 at 17:42
  • @Tim Just create a database on your own: start with the straight logo, then take pictures from different angles until SIFT matching fails. Add that picture to your database. Repeat the process. – Micka Aug 03 '22 at 19:06
  • Thanks @Micka. I tried several things; is what you propose, after making my own dataset, training via a neural network, or were you thinking of something else? – Tim Aug 10 '22 at 14:00
  • Assign the SIFT keypoints of all your reference images to one object. Then, for a query image, compare its keypoints to all the keypoints of the whole object. – Micka Aug 10 '22 at 15:14
  • Thanks again; I'm doing that and I'll get back to you with the result when it's done. – Tim Aug 11 '22 at 16:25
  • @Micka "SIFT keypoints of all your reference images **to one object**" – can you detail this part? Does it mean taking all the keypoints and descriptors of all the images in my dataset and comparing them successively with the test image via flann.knnMatch? – Tim Aug 15 '22 at 15:43
  • Imagine for example that you have an image from the front of the object and an image from the back. A new image will not match both of them at the same time, but it might match one of them. If it matches one of them, the object is present in the image. Instead of using only front and back, use as many images from different angles as necessary. As long as your query image matches one of the "model" images, you have detected your object. – Micka Aug 15 '22 at 16:55

1 Answer


This viewpoint change is too big for SIFT. Try something like AffNet (https://kornia-tutorials.readthedocs.io/en/latest/image_matching_adalam.html) or ASIFT (https://ipolcore.ipol.im/demo/clientApp/demo.html?id=63).

old-ufo