Questions tagged [knn]

In pattern recognition, k-nearest neighbors (k-NN) is a classification algorithm used to classify an example based on a set of already classified examples. Algorithm: a case is classified by a majority vote of its neighbors, with the case being assigned to the class most common among its k nearest neighbors, measured by a distance function. If k = 1, the case is simply assigned to the class of its nearest neighbor.

The idea of the k-nearest neighbors (k-NN) algorithm is to use the known features of an example to determine its unknown classification.

First, a set of already-classified samples is supplied to the algorithm as the training set. When a new, unclassified sample is given, the algorithm finds its k nearest neighbors among the training samples and determines its classification from their labels.

The algorithm is sometimes called a lazy classifier because it does nothing during "learning" beyond storing the samples; all the work is done at classification time.
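
To make the "lazy" behaviour concrete, here is a minimal from-scratch sketch in Python (the data and names are made up for illustration): training only stores the samples, and every distance computation happens when a query is classified.

    from collections import Counter
    import math

    def knn_classify(train_points, train_labels, query, k=3):
        """Classify `query` by majority vote among its k nearest training points."""
        # All the work happens here, at classification time ("lazy" learning):
        # compute the distance from the query to every stored sample.
        distances = [(math.dist(p, query), label)
                     for p, label in zip(train_points, train_labels)]
        distances.sort(key=lambda d: d[0])              # nearest first
        k_labels = [label for _, label in distances[:k]]
        return Counter(k_labels).most_common(1)[0][0]   # majority vote

    # "Training" is just storing the samples.
    X_train = [(1.0, 1.0), (1.2, 0.8), (6.0, 6.0), (6.5, 5.5)]
    y_train = ["A", "A", "B", "B"]
    print(knn_classify(X_train, y_train, (5.8, 6.2), k=3))  # -> "B"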

Algorithm

The k-NN algorithm is among the simplest of all machine learning algorithms. A shortcoming of the k-NN algorithm is that it is sensitive to the local structure of the data.

The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples.

In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the k training samples nearest to that query point.
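
In library form the two phases look like the sketch below (scikit-learn is assumed to be available; the toy data is made up):

    from sklearn.neighbors import KNeighborsClassifier

    # Training phase: the classifier essentially stores the vectors and labels.
    X_train = [[0, 0], [0, 1], [5, 5], [6, 5]]
    y_train = [0, 0, 1, 1]
    clf = KNeighborsClassifier(n_neighbors=3)   # k is the user-defined constant
    clf.fit(X_train, y_train)

    # Classification phase: each query point receives the most frequent label
    # among its 3 nearest training samples.
    print(clf.predict([[5, 6], [0, 0.5]]))      # -> [1 0]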

A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as for text classification, another metric can be used, such as the overlap metric (or Hamming distance).
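
As a small illustration of the two kinds of metric (a sketch assuming SciPy is available):

    from scipy.spatial import distance

    # Continuous features: Euclidean distance.
    a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
    print(distance.euclidean(a, b))      # sqrt(3^2 + 4^2 + 0^2) = 5.0

    # Discrete features encoded as category codes: Hamming distance,
    # i.e. the fraction of positions at which the vectors differ.
    u, v = [1, 0, 2, 2], [1, 1, 2, 0]
    print(distance.hamming(u, v))        # 0.5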

Often, the classification accuracy of k-NN can be improved significantly if the distance metric is learned with specialized algorithms such as Large Margin Nearest Neighbor or Neighbourhood components analysis.
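
For example, scikit-learn provides NeighborhoodComponentsAnalysis, which can be chained in front of a k-NN classifier; a rough sketch (the digits dataset and the split are only for illustration, and the improvement depends on the data):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
    from sklearn.pipeline import Pipeline

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Learn a linear transformation of the feature space (NCA), then run k-NN
    # in the transformed space.
    model = Pipeline([
        ("nca", NeighborhoodComponentsAnalysis(random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=3)),
    ])
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))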

1775 questions

0 votes, 0 answers

R knncat Error in 1:knots.vec[num.ctr]

Apologies if this is elsewhere (and if my question is poorly written - this is my first post). I have searched for days and solved all my other errors, but I keep getting this one: "Error in 1:knots.vec[num.ctr] : NA/NaN argument". I am trying to…
JHawkins

0 votes, 1 answer

I trained eigenfaces using kNN, but testing with a new image gives an error in the findNearest function

I tried to train eigenfaces using kNN. It trained, but when I test it with a new image it gives me an error in the findNearest function: cv2.error: D:\Build\OpenCV\opencv-3.2.0\modules\ml\src\knearest.cpp:325: error: (-215) test_samples.type() ==…
Krishna Kamal
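
That OpenCV assertion usually means the arrays passed to the k-NN model are not single-precision floats; cv2.ml.KNearest expects CV_32F data. A minimal sketch of the expected dtypes (the eigenface projection is assumed to have been done already, and the array shapes and names here are made up):

    import numpy as np
    import cv2

    # Projected eigenface coefficients (one row per image) and their labels.
    train_data = np.random.rand(20, 50).astype(np.float32)   # must be float32 (CV_32F)
    labels = np.random.randint(0, 4, (20, 1)).astype(np.float32)

    knn = cv2.ml.KNearest_create()
    knn.train(train_data, cv2.ml.ROW_SAMPLE, labels)

    # The test sample must also be a 2-D float32 array, even for a single image.
    test_sample = np.random.rand(1, 50).astype(np.float32)
    ret, results, neighbours, dist = knn.findNearest(test_sample, k=3)
    print(results)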

0 votes, 0 answers

Octave Forge knnsearch

I just got Octave Forge the other day, and am trying to run my MATLAB code. There's a line in my code that's giving me this error warning: the 'knnsearch' function belongs to the statistics package from Octave Forge but has not yet been…
user1153070

0 votes, 2 answers

Alternative, efficient way to compute distance instead of Euclidean distance in the kNN algorithm

I have implemented the kNN algorithm and this is my function to calculate the Euclidean distance. def euc_dist(self, train, test): return math.sqrt(((train[0] - test[0]) ** 2) + ((test[1] - train[1]) ** 2)) # def euc_distance(self, test): …
nirvair
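
A common way to make this both faster and correct over all features is to vectorise the computation with NumPy rather than indexing individual coordinates; a sketch with illustrative names:

    import numpy as np

    def euclidean_distances(train, test_point):
        """Distance from one test point to every training row, over all features."""
        train = np.asarray(train, dtype=float)
        test_point = np.asarray(test_point, dtype=float)
        return np.sqrt(((train - test_point) ** 2).sum(axis=1))

    train = [[1, 2, 3], [4, 6, 3], [0, 0, 0]]
    print(euclidean_distances(train, [1, 2, 3]))  # [0.  5.  3.742]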

0 votes, 1 answer

How to get a list of all the neighbors from KNN for each instance in Python

I don't know much about Python, but for my class I must make a KNN classification where I need to get all the neighbors (not just the one predicted by majority vote) and get the percentage for each of the neighbors (the percentage chance of occurring). For…
khan
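
With scikit-learn this can be read off a fitted classifier: kneighbors returns the indices (and distances) of each instance's k neighbors, and predict_proba gives the fraction of the k votes per class. A small sketch with made-up data:

    from sklearn.neighbors import KNeighborsClassifier

    X_train = [[0, 0], [0, 1], [1, 1], [5, 5], [6, 5], [6, 6]]
    y_train = ["a", "a", "a", "b", "b", "b"]
    clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

    X_new = [[0.5, 0.5], [5.5, 5.0]]
    dist, idx = clf.kneighbors(X_new)   # the k nearest training rows per instance
    print(idx)                          # indices into X_train
    print(clf.predict_proba(X_new))     # vote fractions per class
    print(clf.classes_)                 # column order of predict_proba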

0 votes, 0 answers

Getting error "k = 0 must be at least 1" when I give k a value in the kNN algorithm (R)

Whenever I put this line of code: > m1 <- knn( train = trainSetNorm[,c(1:41)], test = testSetNorm[,c(1:41)], cl = trainSetNorm[,c(42)], k = 703) I get the following error: Error in knn(train = trainSetNorm[, c(1:41)], test = testSetNorm[,…

0 votes, 1 answer

KNN algorithm from scratch in Python

I am trying to execute a KNN algorithm from scratch, but I am getting a really strange error saying "KeyError: 0". I assume this implies I have an empty dictionary somewhere, but I don't understand how that can be. I might just add, for the sake of…
Ali

0 votes, 3 answers

What parameters to optimize in KNN?

I want to optimize KNN. There is a lot about SVM, RF and XGBoost, but very little for KNN. As far as I know, the number of neighbors is one parameter to tune. But what other parameters should be tested? Is there any good article? Thank you
Aizzaac
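
Beyond the number of neighbors, the usual scikit-learn knobs are the vote weighting and the distance metric (for instance the Minkowski power p), and feature scaling matters as well; a grid-search sketch over those, with an arbitrary toy dataset:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)

    pipe = Pipeline([("scale", StandardScaler()),   # scaling itself matters for k-NN
                     ("knn", KNeighborsClassifier())])
    param_grid = {
        "knn__n_neighbors": [1, 3, 5, 7, 11, 15],
        "knn__weights": ["uniform", "distance"],
        "knn__p": [1, 2],                           # Manhattan vs Euclidean
    }
    search = GridSearchCV(pipe, param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)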

0 votes, 1 answer

KNN doesn't classify correctly

I'm building my own 1-NN classifier in Python because I need maximum speed in certain operations when testing it, since I want to use it in a genetic algorithm and every millisecond is important for speed. I'm trying to implement a leave-one-out test inside…
rafaelleru
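
One common way to keep a leave-one-out 1-NN test fast is to compute the full pairwise distance matrix once, mask the diagonal, and take an argmin per row; a sketch (the names and toy data are illustrative):

    import numpy as np

    def leave_one_out_1nn_accuracy(X, y):
        """Leave-one-out accuracy of a 1-NN classifier, vectorised with NumPy."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        # Squared pairwise distances between all samples (n x n matrix).
        sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        np.fill_diagonal(sq, np.inf)   # a sample may not be its own neighbor
        nearest = sq.argmin(axis=1)    # each sample's nearest *other* sample
        return float((y[nearest] == y).mean())

    X = [[0, 0], [0.1, 0], [5, 5], [5.1, 5]]
    y = [0, 0, 1, 1]
    print(leave_one_out_1nn_accuracy(X, y))  # 1.0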

0 votes, 3 answers

Permutations and combinations of all the columns in R

I want to check all the permutations and combinations of columns while selecting models in R. I have 8 columns in my data set and the below piece of code lets me check some of the models, but not all. Models like column 1+6, 1+2+5 will not be…
Vipin Verma

0 votes, 1 answer

Plotting a KNN classifier graph for multiple features

I am trying to plot a KNN graph for my data but I keep getting this error, which I can't figure out. clf = neighbors.KNeighborsClassifier(k, weights=weights) AttributeError: 'list' object has no attribute 'KNeighborsClassifier' Below I have attached…
Thom Elliott

0 votes, 1 answer

Apache Spark - java.lang.NoSuchMethodError: breeze.linalg.Vector$.scalarOf()Lbreeze/linalg/support/ScalarOf

Here is the error: Exception in thread "main" java.lang.NoSuchMethodError: breeze.linalg.Vector$.scalarOf()Lbreeze/linalg/support/ScalarOf; at org.apache.spark.ml.knn.Leaf$$anonfun$4.apply(MetricTree.scala:95) at…

0 votes, 0 answers

How to create a PCA for KNN classification?

I have been following this tutorial for the iris flower dataset. I am trying to create a 2D PCA for my own data but I can't figure out what to change, etc. This is my code: data_df = pd.DataFrame.from_csv("fvectors.csv") data_df =…
Thom Elliott
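
The usual recipe is to fit a 2-component PCA on the feature columns and hand the transformed coordinates to the k-NN step (or to the plot); a sketch in which the file name and the "label" column are only assumptions based on the question wording:

    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier

    # Hypothetical layout: feature columns plus a final "label" column.
    data_df = pd.read_csv("fvectors.csv")
    X = data_df.drop(columns=["label"]).values
    y = data_df["label"].values

    pca = PCA(n_components=2)             # project onto the first two components
    X_2d = pca.fit_transform(X)

    clf = KNeighborsClassifier(n_neighbors=5).fit(X_2d, y)
    print(pca.explained_variance_ratio_)  # variance captured by the 2-D view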

0 votes, 1 answer

Sending large amount of data from one system to another

Actually I am trying to send trained data from system 1 to system 2, so that I can do KNN classification in system 2. But I find it difficult to send the trained data as it is very large. Is there any way to send bulky data from one system to another…
Neenu

0 votes, 1 answer

R Does knncat normalise data?

I am performing a knn analysis of some data. I have both categorical (with more than 2 factors) and continuous data. I found a package that accounts for this situation (knncat) but there's very little documentation explaining how it actually works.…
BlueRhapsody