I used to find nearest neighbours in data array X with sklearn.neighbors.NearestNeighbors like this:

from sklearn.neighbors import NearestNeighbors
import numpy as np
nn = NearestNeighbors(n_neighbors=2).fit(X)
distances, indices = nn.kneighbors(X)

In this case, I get the two closest neighbours in X for each row of the same X, which is straightforward. However, I'm confused about applying the fitted nn.kneighbors to another array Y:

distances, indices = nn.kneighbors(Y)

This gives 2 neighbours for each row in Y, but how should I interpret this? Are the found neighbours from X or from Y? And in kneighbors([X, n_neighbors, return_distance]) on sklearn's documentation page, what is n_neighbors?

Fredrik
  • What do you expect it to do? I.e., are you familiar with k-nearest-neighbour algorithms and machine learning in general? NearestNeighbors is just an unsupervised model, and its output on a different array than the one it was fit on is interpreted in the same way as for any other unsupervised model. To answer your question: the found neighbours are from X. Getting neighbours from Y would violate the principles of fitting an unsupervised model, so I suggest you read up on the basics of machine learning first. Also, the `n_neighbors` function parameter simply overrides, for that call, the value passed to the constructor. – Egeau May 23 '23 at 09:43

1 Answer

NearestNeighbors implements unsupervised nearest neighbors learning.

What is the main difference between unsupervised and supervised learning? [image not preserved]

That means you don't have a "target"; you just try to find the neighbours.

Here is a further explanation: What is the difference between supervised learning and unsupervised learning?

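So for a "new" dataset it will identify the nearest neighbours again. In this case, for each row of Y, it returns the nearest neighbours among the rows of X, because X is the data the model was fitted on. The returned indices are therefore row indices into X, not into Y.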
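To make this concrete, here is a minimal sketch with made-up toy arrays (the values of X and Y below are assumptions chosen only for illustration). It fits on X, queries with Y, and uses the returned indices to look the neighbours up in X. It also shows the n_neighbors argument of kneighbors, which overrides the constructor value for that one call.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy data, assumed purely for illustration.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
Y = np.array([[0.9, 0.1], [5.2, 5.1]])

# Fit on X: X becomes the pool that neighbours are drawn from.
nn = NearestNeighbors(n_neighbors=2).fit(X)

# Query with Y: one row of results per row of Y.
distances, indices = nn.kneighbors(Y)

# Each entry of `indices` is a row number in X, so index X (not Y) with it.
neighbours = X[indices]  # shape: (len(Y), 2, n_features)

# Passing n_neighbors here overrides the constructor's value for this call only.
d1, i1 = nn.kneighbors(Y, n_neighbors=1)

print(indices)
print(i1)
```

Note that when the query array is the same X the model was fitted on, as in nn.kneighbors(X), each point is returned as its own nearest neighbour at distance 0. Calling nn.kneighbors() with no array queries the fitted data while excluding each point from its own neighbour list.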

PV8