I have a database of 300 images, and for each of them I extracted a bag-of-visual-words (BOVW) vector. Starting from a query image (with query_BOVW extracted using the same dictionary), I need to find similar images in my training dataset.

I built a scikit-learn KDTree on my training set with kd_tree = KDTree(training), and I then compute the distances from the query vector with kd_tree.query(query_vector). The second parameter of query is the number of nearest neighbours to return, but what I want is to set a threshold on the Euclidean distance, so that the number of neighbours returned varies with that threshold.

I looked through the documentation but did not find anything about this. Am I asking for something that does not make sense?

Thanks for the help.

Furin

2 Answers

You want to use query_radius here.

query_radius(self, X, r, count_only = False):

query the tree for neighbors within a radius r

...

Here is the example from the linked documentation, with KDTree as the concrete tree class:

import numpy as np
from sklearn.neighbors import KDTree

np.random.seed(0)
X = np.random.random((10, 3))  # 10 points in 3 dimensions
tree = KDTree(X, leaf_size=2)
# X[:1] keeps the query 2-D, with shape (1, 3)
print(tree.query_radius(X[:1], r=0.3, count_only=True))  # number of neighbors

ind = tree.query_radius(X[:1], r=0.3)
print(ind)  # indices of neighbors within distance 0.3
sascha

From the documentation, you can use the method query_radius:

Query for neighbors within a given radius:

import numpy as np
from sklearn.neighbors import KDTree

np.random.seed(0)
X = np.random.random((10, 3))  # 10 points in 3 dimensions
tree = KDTree(X, leaf_size=2)
# X[:1] keeps the query 2-D, with shape (1, 3)
print(tree.query_radius(X[:1], r=0.3, count_only=True))
ind = tree.query_radius(X[:1], r=0.3)  # indices of neighbors within distance 0.3

This works with scikit-learn version 0.19.1.
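Applied to the setup in the question, a minimal sketch might look like the following. The random arrays stand in for the real BOVW vectors, and the 50-word dictionary size and the threshold of 2.8 are placeholder assumptions, not values from the question. Passing return_distance=True also returns the distances, and sort_results=True orders each result from nearest to farthest.

```python
import numpy as np
from sklearn.neighbors import KDTree

rng = np.random.default_rng(0)
# Placeholder data: 300 BOVW vectors over an assumed 50-word dictionary.
training = rng.random((300, 50))
# The query must be 2-D: one row per query vector.
query_vector = rng.random((1, 50))

kd_tree = KDTree(training)  # the default metric is Euclidean

threshold = 2.8  # hypothetical Euclidean distance threshold

# With return_distance=True, query_radius returns (indices, distances),
# each holding one array per query row.
ind, dist = kd_tree.query_radius(
    query_vector, r=threshold, return_distance=True, sort_results=True
)
neighbours = ind[0]   # indices of training images within the threshold
distances = dist[0]   # matching distances, nearest first

print(len(neighbours), "images within distance", threshold)
```

The number of neighbours returned now depends only on the threshold, so a tighter threshold can legitimately produce an empty result.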

Eolmar