I'm trying to retrieve a set of similar images based on an input image. I'm using OpenCV for Python, by the way. My strategy is to extract the SURF features from every image in the database and put them into a k-NN model, so that when I query with an image's SURF features I can use k-NN to get its nearest neighbors. The problem is that I tried training the k-NN model in scikit-learn by passing in the SURF descriptors and then flattening them. However, this error shows up whenever I try to train the model: setting an array element with a sequence.

What am I doing wrong? How should I represent the features so that I can use them with k-NN?

UPDATE: Here's my code

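import cv2
import numpy as np
from sklearn.neighbors import NearestNeighbors

# filepath (the image directory) is defined elsewhere and not shown here
SURFObject = cv2.SURF(hessianThreshold = 400, extended = 0)
image_names = []
image_descriptors = []
for i in range(1, 4001):
    print("Image Number: " + str(i))
    filename = 'cat.'+ str(i) +'.jpg'
    img = cv2.imread(filepath + filename)
    # descriptors has one row per key point, so its length varies per image
    keypoints, descriptors = SURFObject.detectAndCompute(img, None)
    image_descriptors.append(descriptors.tolist())
    image_names.append(filename)

neighbors = NearestNeighbors(10, 0.5)
# Training is where the error shows up
neighbors.fit(np.array(image_descriptors).reshape(-1,1))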
Jessie

1 Answer

I am not sure exactly which error message you get, but you certainly have a problem with the dimensions of the descriptors.

SURF first finds the key points in the image and then generates a fixed-size descriptor for each key point. The catch is that each image yields a different number of key points, so when you call "descriptors.tolist()" you get all the descriptors for that image's key points, and the resulting list has a different length for each image.
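To illustrate the dimension mismatch, here is a minimal sketch that uses random arrays as stand-ins for real SURF output. The 64-value descriptor length assumes extended = 0 as in the question; the key point counts (3, 5, 2) are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for SURF output: one (n_keypoints, 64) array per image,
# where n_keypoints differs from image to image (counts are made up).
descriptors_per_image = [rng.random((n, 64)) for n in (3, 5, 2)]
shapes = [d.shape for d in descriptors_per_image]

# The per-image arrays do not form a rectangular block, so NumPy
# cannot build a single numeric array from them.
try:
    np.array(descriptors_per_image, dtype=float)
    ragged_ok = True
except ValueError:
    ragged_ok = False

# Stacking row-wise works because every descriptor has 64 values,
# but the result has one row per key point, not one row per image.
stacked = np.vstack(descriptors_per_image)
print(shapes, ragged_ok, stacked.shape)
```

Stacking makes the array rectangular, but it loses the one-row-per-image layout that k-NN over images needs, which is why a fixed-length representation per image is required.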

Try reading about bag of words or, even better, Fisher vectors to deal with this kind of problem.

M.Sabaa
  • Does this mean that SURF and k-NN aren't feasible for retrieving similar images? – Jessie Oct 23 '17 at 11:44
  • You can choose a fixed number of key points for each image. That would also be feasible, but depending on the nature of your images it might not give good results – M.Sabaa Oct 23 '17 at 12:02
  • How exactly does Bag of Words fit with these descriptors? – Jessie Oct 23 '17 at 12:57
  • There is a lot to say about how bag of words works, and you can easily find plenty of references on it – M.Sabaa Oct 24 '17 at 10:48
  • But in short, you need to generate a codebook (cluster your original descriptors using k-means) from a training set of pictures; then the descriptor for each new image is a kind of histogram of its SURF descriptors over the codebook's clusters – M.Sabaa Oct 24 '17 at 10:51