
I understand that this is a popular question on Stack Overflow; however, I have not managed to find the best solution yet.

Background

I am trying to classify an image against a database of 10,000 unique images. For each image in my database I have only a single training image, so I have a DB of 10,000 and the possible output classes are also 10,000. In other words, there are 10,000 unique objects and I have exactly one image of each.

The goal is to match an input image to the 'best' matching image in the DB.

I am currently using Python with OpenCV's SIFT implementation to extract keypoints/descriptors, then applying the standard matching methods to find the image in the DB that best matches the input image.

Code

I use the following code to iterate over my database of images, compute the keypoints/descriptors for each, and save the descriptors to a file. This saves time later on.

import cv2
import numpy as np
import pandas as pd
from tqdm import tqdm

# Create the SIFT detector once, outside the loop
sift = cv2.xfeatures2d.SIFT_create()

for i in tqdm(range(labels.shape[0])):  # Iterate over the length of the DB

    # Read image from DB
    img_path = 'data/' + labels['Image_Name'][i]
    img = cv2.imread(img_path)

    # Resize to ensure all images are equal for ROI
    dim = (734, 1024)
    img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

    # Grayscale
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # ROI
    img = img[150:630, 20:700]

    # SIFT keypoints and descriptors
    keypoints_1, descriptors_1 = sift.detectAndCompute(img, None)

    # Save descriptors
    path = 'data/' + labels['Image_Name'][i].replace(".jpeg", "_descriptors.csv")
    np.savetxt(path, descriptors_1, delimiter=',')

Then, when I am ready to classify an image, I read all of the descriptors back in. This has proven to be about 30% quicker.

# List to store all of the descriptors from SIFT
descriptors = []

for i in tqdm(range(labels.shape[0])):  # Iterate over the length of the DB

    # Read in the descriptor file
    path = 'data/' + labels['Image_Name'][i].replace(".jpeg", "_descriptors.csv")
    descriptor = np.loadtxt(path, delimiter=',')

    # Add to list
    descriptors.append(descriptor)
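As an aside, the CSV round trip (`savetxt`/`loadtxt`) is slow to parse; NumPy's binary `.npy` format is typically much faster to read and preserves the dtype exactly. A minimal sketch (the array and file path here are just illustrations, not my real data):

```python
import os
import tempfile

import numpy as np

# Hypothetical descriptor array standing in for one image's SIFT output
descriptors = np.random.default_rng(2).random((500, 128)).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "img_descriptors.npy")
np.save(path, descriptors)   # binary write instead of savetxt
loaded = np.load(path)       # binary read instead of loadtxt

print(np.array_equal(loaded, descriptors))  # True
```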

Finally, I read in an image, apply the SIFT method as above, and then find the best match.

# Read in and preprocess the query image
img = cv2.imread(PATH)

# Resize
dim = (734, 1024)
img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

# Grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# ROI
img = img[150:630, 20:700]

# SIFT keypoints and descriptors for the query image
sift = cv2.xfeatures2d.SIFT_create()
keypoints_1, descriptors_1 = sift.detectAndCompute(img, None)

# Use FLANN (faster); FLANN_INDEX_KDTREE is 1 (algorithm=0 is linear search, which ignores `trees`)
index_params = dict(algorithm=1, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)

# Store results
scoresdf = pd.DataFrame(columns=["index", "score"])

# Find best matches in DB
for i in tqdm(range(labels.shape[0])):

    # Get the pre-computed descriptors for this DB image
    descriptors_2 = np.float32(descriptors[i])

    # Find the two nearest neighbours for each query descriptor
    matches = flann.knnMatch(descriptors_1, descriptors_2, k=2)

    # Normalise by the smaller of the two keypoint counts
    number_keypoints = min(len(descriptors_1), len(descriptors_2))

    # Keep 'good' matches using Lowe's ratio test
    good_points = []
    ratio = 0.6
    for m, n in matches:
        if m.distance < ratio * n.distance:
            good_points.append(m)

    # Similarity score
    score = len(good_points) / number_keypoints * 100
    scoresdf.loc[len(scoresdf)] = [i, score]

This all works, but it takes some time and I would like to find a match much more quickly.

Solutions?

I have read about the bag of words (BoW) method. However, I do not know if this will work given that there are 10,000 classes. Would I need to set K = 10,000?

Given that each descriptor is an array, is there a way to reduce my search space? Can I find the X closest arrays (descriptors) to the descriptor of my input image?
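One idea I have seen for the "X closest arrays" part: stack every descriptor from every DB image into a single matrix, remember which image each row came from, and run one nearest-neighbour search instead of 10,000 separate matching loops. A minimal NumPy sketch of that idea, with small random arrays standing in for the real SIFT descriptors (the names, shapes, and brute-force distance loop are illustrative, not my actual data or code):

```python
import numpy as np

# Random stand-ins for the per-image descriptor arrays loaded from disk
rng = np.random.default_rng(0)
db_descriptors = [rng.random((int(rng.integers(5, 10)), 128), dtype=np.float32)
                  for _ in range(4)]

# Stack every descriptor in the DB into one matrix, and remember
# which image each row came from
stacked = np.vstack(db_descriptors)
image_ids = np.concatenate([np.full(len(d), i)
                            for i, d in enumerate(db_descriptors)])

def match_image(query_descriptors, stacked, image_ids):
    """Vote for the DB image owning the nearest neighbour of each query descriptor."""
    votes = np.zeros(image_ids.max() + 1, dtype=int)
    for q in query_descriptors:
        dists = np.linalg.norm(stacked - q, axis=1)  # L2 distance to every DB row
        votes[image_ids[np.argmin(dists)]] += 1
    return int(np.argmax(votes))

# Querying with descriptors taken from image 2 should return index 2
best = match_image(db_descriptors[2], stacked, image_ids)
print(best)  # 2
```

In practice the brute-force distance loop would be replaced by a single FLANN (or similar approximate nearest-neighbour) index built once over `stacked`.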

Any help would be greatly appreciated :)

Edit

Could I use a Bag of Words (BoW) method to create X clusters, then, when I read in a new image, work out which cluster it belongs to, and then use SIFT matching on the images in that cluster to find the exact match? I am struggling to find many code examples for this.
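For reference, my understanding of the BoW idea is: the vocabulary size K is a free parameter and does not need to equal the number of classes, because each image is represented by a histogram over the K visual words, and those histograms are what get compared. A minimal sketch with a tiny hand-rolled k-means and random data in place of real SIFT descriptors (all names, sizes, and the k-means itself are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20):
    """Tiny Lloyd's k-means, just enough for a vocabulary sketch."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Normalised histogram of visual-word assignments for one image."""
    labels = np.argmin(((descriptors[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()

# Fake per-image descriptor sets standing in for the SIFT output above
db = [rng.random((30, 128)) for _ in range(5)]
vocab = kmeans(np.vstack(db), k=8)
db_hists = np.array([bow_histogram(d, vocab) for d in db])

# Shortlist the 2 DB images whose histograms are closest to the query's,
# then run the full SIFT/FLANN matching only on that shortlist
query = db[3]
qh = bow_histogram(query, vocab)
shortlist = np.argsort(np.linalg.norm(db_hists - qh, axis=1))[:2]
print(shortlist)
```

In a real pipeline the vocabulary would come from k-means over a large sample of the DB descriptors (e.g. `cv2.kmeans` or scikit-learn's `MiniBatchKMeans`), and only the shortlisted images would go through the per-image matching loop above.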

brian4342