Find K Closest pairs in a spatial database (Without specific query object)

Question

Input:

• N points {P1, …, Pn}, all of the same dimension t:

Pi = {x_1, …, x_t}, where t is between 18 and 30.

• Distance function dist(Pi, Pj): returns a number that is the distance between the two points. (It is a custom function, not a standard Minkowski distance.)

The Problem:

• Main problem:

Find the K closest pairs from all the N points – as fast as possible.

• Secondary problem:

Given a point Q = {x_1, …, x_t}, return the K closest pairs that include Q (that is, the K points nearest to Q).

• Nice to Have:

A Database where we can add/remove a point Pi and the above queries will run as fast as possible.

Relevant Data structures:

• KD-TREE

• R-TREE

• BALL-TREE

Possible Solutions:

• Main problem:

Build a BallTree (sklearn.neighbors.BallTree).
For every point Pi in the BallTree, find its K nearest neighbours. This gives N lists, one per point, each containing the K closest pairs that involve that point.
Take the best K pairs across all of these lists.
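The three steps above can be sketched roughly as follows. This is only an illustration under some assumptions: `custom_dist` is a hypothetical stand-in (a plain L1 distance) for the real custom function, and it is passed to the tree through scikit-learn's `metric="pyfunc"` option, which expects the function to be a true metric (in particular, to satisfy the triangle inequality). The helper name `k_closest_pairs` is also made up for this example.

```python
import heapq

import numpy as np
from sklearn.neighbors import BallTree


def custom_dist(a, b):
    # Hypothetical placeholder for the real custom distance.
    # BallTree needs a true metric, so L1 is used here for illustration.
    return np.abs(a - b).sum()


def k_closest_pairs(points, k, dist=custom_dist):
    """Return up to k closest pairs as (distance, i, j) tuples with i < j."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    tree = BallTree(points, metric="pyfunc", func=dist)

    # Ask for k + 1 neighbours: each point is normally returned as its own
    # nearest neighbour (distance 0), and that entry is dropped below.
    dists, idxs = tree.query(points, k=min(k + 1, n))

    # Every pair can show up twice, once from each endpoint, so key the
    # candidates by the ordered index pair to deduplicate them.
    candidates = {}
    for i in range(n):
        for d, j in zip(dists[i], idxs[i]):
            j = int(j)
            if i != j:
                candidates[(min(i, j), max(i, j))] = float(d)

    return heapq.nsmallest(k, ((d, i, j) for (i, j), d in candidates.items()))
```

A call such as `k_closest_pairs(points, 10)` would then return the ten closest pairs. The approach relies on the observation that any pair among the global K closest must also appear in the K-nearest-neighbour list of each of its endpoints (ignoring ties).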

• Secondary problem:

Build a BallTree (sklearn.neighbors.BallTree).
Query the tree for the K closest pairs for a given point Q.
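For the secondary problem, a minimal sketch might look like this. As before, `custom_dist` is a hypothetical placeholder (L1 distance) for the real function, and `build_tree` and `k_nearest_to_query` are names invented for the example. The tree is built once and can then be reused for many query points.

```python
import numpy as np
from sklearn.neighbors import BallTree


def custom_dist(a, b):
    # Hypothetical placeholder for the real custom distance (must be a metric).
    return np.abs(a - b).sum()


def build_tree(points, dist=custom_dist):
    # Build the tree once; it can serve any number of queries afterwards.
    return BallTree(np.asarray(points, dtype=float), metric="pyfunc", func=dist)


def k_nearest_to_query(tree, q, k):
    """Return [(distance, index), ...] for the k stored points nearest to q."""
    q = np.asarray(q, dtype=float).reshape(1, -1)  # query expects a 2-D array
    dists, idxs = tree.query(q, k=k)  # results come back sorted by distance
    return list(zip(dists[0].tolist(), idxs[0].tolist()))
```

Each returned index refers to a row of the array the tree was built from, so the pair (Q, Pi) can be recovered from it.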

Time Complexity So far:

For every point in the tree (N in total), finding the K closest pairs takes O(K * log(N)), so O(N * K * log(N)) in total.
Taking the best K pairs from the N sorted lists can take O(max{K * log(K), N}), for example by maintaining a min-heap of size K.
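The selection step can be illustrated with a lazy merge of the N sorted lists. This sketch assumes each list holds `(distance, i, j)` tuples in ascending order, which is a representation chosen for the example and not something fixed by the question. It uses `heapq.merge`, which keeps a heap over the current head of each list, so the cost should be roughly O(N) to set up plus O(log(N)) per item consumed.

```python
import heapq


def best_k_from_sorted_lists(lists, k):
    """Pick the k smallest distinct pairs from several ascending lists.

    Each list is assumed to contain (distance, i, j) tuples sorted by distance.
    """
    seen = set()
    result = []
    # heapq.merge yields items from all lists in globally sorted order,
    # pulling from each list only as needed.
    for d, i, j in heapq.merge(*lists):
        pair = (min(i, j), max(i, j))
        if pair in seen:
            continue  # the same pair was already reported from the other endpoint
        seen.add(pair)
        result.append((d, pair[0], pair[1]))
        if len(result) == k:
            break
    return result
```

Because the merge is lazy, only about 2K items are consumed in the worst case (each pair can appear once per endpoint) before the loop stops.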

The total complexity, for now, is O(N * K * log(N)) - can we do better?

@Dominik As I mentioned, I'm trying to solve the Main problem as fast as possible. I added a time complexity section, does it help? — Mike M, Aug 01 '18 at 06:48
It sounds more like a Maths problem than a Programming Problem, you may want to ask that on the Maths Stackoverflow Site. — Dominik, Aug 01 '18 at 09:45
@Dominik I'll ask there as well, but my intention is to find a faster solution for the above problem that really works - that is why I attached the current data structure I'm using (sklearn.neighbors.BallTree). So, the section of the time analysis is just for intuition - I'm just looking for a solution which will run faster. — Mike M, Aug 01 '18 at 11:18
