A k-d tree (k-dimensional tree) is a data structure for storing points in multidimensional space. It can be used to efficiently test whether a point exists, to run Euclidean nearest-neighbor searches, and to search within axis-aligned hyperrectangular regions.
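As a rough illustration of the description above, here is a minimal sketch of a k-d tree in pure Python. It assumes points are tuples of equal length, builds the tree by splitting at the median along a cycling axis, and answers a single nearest-neighbor query. The names `Node`, `build`, `sq_dist`, and `nearest` are hypothetical and chosen only for this example; they do not come from any library.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # the point stored at this node
    axis: int                       # coordinate index this node splits on
    left: Optional["Node"] = None   # points with a smaller-or-equal coordinate
    right: Optional["Node"] = None  # points with a larger-or-equal coordinate


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a tree by splitting at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(
        point=pts[mid],
        axis=axis,
        left=build(pts[:mid], depth + 1),
        right=build(pts[mid + 1:], depth + 1),
    )


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance (avoids an unnecessary square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], target: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to target, or None for an empty tree."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that subtree cannot hold a winner.
    if diff * diff < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(nearest(tree, (9, 2)))  # (8, 1)
```

Sorting at every level keeps the sketch short at the cost of O(n log² n) construction time. The pruning test in `nearest` is what lets a query skip whole subtrees, and it tends to prune less as the number of dimensions grows.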
Questions tagged [kdtree]
457 questions
13
votes
5 answers
Why are KD-trees so damn slow for nearest neighbor search in point sets?
I am using CGAL's (the latest) KD-tree implementation for searching nearest neighbors in point sets. Wikipedia and other resources also seem to suggest that KD-trees are the way to go. But somehow they are too slow, and Wikipedia also suggests their…

thesaint
12
votes
4 answers
Efficient method for finding KNN of all nodes in a KD-Tree
I'm currently attempting to find the K nearest neighbors of all nodes of a balanced KD-Tree (with K=2).
My implementation is a variation of the code from the Wikipedia article, and it's decently fast at finding the KNN of any node, O(log N).
The problem lies…

St. John Johnson
11
votes
3 answers
Quad tree and Kd tree
I have a set of latitudes and longitudes for various locations and also know the latitude and longitude of my current location. I have to find the nearest places to my current location.
Which algorithm, k-d tree or quadtree, is the best one to find…

Kesiya Abraham
11
votes
3 answers
Is a kd-tree always balanced?
I have used the kd-tree algorithm to build a tree.
But I found that the tree is not balanced, so my question is: if we use the kd-tree algorithm, is the tree always balanced? If not, how can we make it balanced?
We can use another algorithm like…

Logicbomb
10
votes
4 answers
How to find the closest pairs (Hamming Distance) of a string of binary bins in Ruby without O(n^2) issues?
I've got a MongoDB with about 1 million documents in it. These documents all have a string that represents a 256-bit bin of 1s and 0s, like:
0110101010101010110101010101
Ideally, I'd like to query for near binary matches. This means, if the two…

Williamf
10
votes
1 answer
Connected components on Pandas DataFrame with Networkx
Action
To cluster points based on distance and label using connected components.
Problem
The back and forth switching between NetworkX nodes storage of attributes and Pandas DataFrame
Seems too complex
Index/key errors when looking up…

Tom Hemmes
9
votes
1 answer
Tuning leaf_size to decrease time consumption in Scikit-Learn KNN
I was trying to implement KNN for handwritten character recognition and found that the code was taking a lot of time to execute. When I added the parameter leaf_size with a value of 400, I observed that the time taken by the code to execute was significantly…

Harshit Saini
9
votes
0 answers
How to add/remove data points to/from a scikit-learn KD-Tree?
I am wondering if it is possible to add or remove data points from a scikit-learn KD-Tree instance after its creation?
For example:
from sklearn.neighbors import KDTree
import numpy as np
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1],…

Yahia
9
votes
2 answers
Is a k-d tree efficient for kNN (k nearest neighbors) search?
I have to implement k nearest neighbors search for 10-dimensional data in a kd-tree.
But the problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2, 5, 10, 20, 100).
Is this normal for kd trees, or am I doing something…

Andraz
9
votes
3 answers
Efficient way to handle 2D line segments
I have a huge set of 2D line segments. For each one I know the line number,
the begin (X,Y,Z) and the end (X,Y,Z) of the segment. I want to get the
line segments in proximity to a given line segment, and likewise for all of them.
To find the proximity I can apply this
If I…

gnp
9
votes
2 answers
What spatial indexing algorithm should I use?
I want to implement some kind of spatial indexing data structure for my MKAnnotations.
Currently it's horribly slow when I try to filter them based on distance criteria (3-4k locations, currently extremely slow with a simple double for ...…

Templar
8
votes
1 answer
How do I use kd-trees for determining string similarity?
I am trying to use k-nearest neighbors for the string similarity problem, i.e., given a string and a knowledge base, I want to output the k strings that are most similar to my given string. Are there any tutorials that explain how to utilize kd-trees to…

Legend
8
votes
2 answers
Remove root from k-d-Tree in Python
As someone who is new to Python, I don't understand how to remove an instance of a class from inside a recursive function.
Consider this code of a k-d tree:
def remove(self, bin, targetAxis=0, parent=None):
    if not self:
        return None
    elif…

greedsin
8
votes
3 answers
How to best store lines in a kd-tree
I know kd-trees are traditionally used to store points, but I want to store lines instead. Would it be best to split the line at every intersection with the splitting of the kd-tree? Or would storing just the endpoints in the kd-tree suffice for nearest…

newDelete
8
votes
2 answers
How to find set of points in x,y grid using KDTree.query_ball_tree
I am working in Python and I have an x,y mesh grid stored as NumPy arrays. I need to find, for each point (x1,y1) in the grid, the points that are at a distance r from (x1,y1). SciPy has a function KDTree.query_ball_tree which takes as input,…

Numby