Questions tagged [knn]

In pattern recognition, k-nearest neighbors (k-NN) is a classification algorithm used to classify example based on a set of already classified examples. Algorithm: A case is classified by a majority vote of its neighbors, with the case being assigned to the class most common amongst its K nearest neighbors measured by a distance function. If K = 1, then the case is simply assigned to the class of its nearest neighbor.

The idea of the k-nearest neighbors (k-NN) algorithm is using the features of an example - which are known, to determine the classification of it - which is unknown.

First, some classified samples are supplied to the algorithm. When a new non-classified sample is given, the algorithm finds the k-nearest neighbors to the new sample, and determines what should its classification be, according to the classification of the classified samples, which were given as training set.

The algorithm is sometimes called lazy classification because during "learning" it does nothing - just stores the samples, and all the work is done during classification.

Algorithm

The k-NN algorithm is among the simplest of all machine learning algorithms. A shortcoming of the k-NN algorithm is that it is sensitive to the local structure of the data.

The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples.

In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the k training samples nearest to that query point.

A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as for text classification, another metric can be used, such as the overlap metric (or Hamming distance).

Often, the classification accuracy of k-NN can be improved significantly if the distance metric is learned with specialized algorithms such as Large Margin Nearest Neighbor or Neighbourhood components analysis.

Useful links

1775 questions
0
votes
1 answer

R: how do you calculate prediction accuracy for KNN?

library(caret) irisFit1 <- knn3(Species ~ ., iris) irisFit2 <- knn3(as.matrix(iris[, -5]), iris[,5]) data(iris3) train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3]) test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3]) cl <-…
Adrian
  • 9,229
  • 24
  • 74
  • 132
0
votes
2 answers

MGLEARN Plot KNN classification doesn't show the plot

I am moving the first ML steps reading the book Introduction to Machine Learning. I am trying to generate the picture of the snippet "In [10]" that you can find on this page, but it down't work. When I say that it doesn't work I mean that nothing…
Nicolaesse
  • 2,554
  • 12
  • 46
  • 71
0
votes
1 answer

Anova test regression vs. knn in R

I'm trying to take an anova test for two different models in R: a lm model vs. a knn model. The problem is this error appears: Error in anova.lmlist(object, ...) : models were not all fitted to the same size of dataset I think this make sense…
Carlos
  • 889
  • 3
  • 12
  • 34
0
votes
1 answer

how to use the PCs (resulting from PCA) on my dataset in R?

I am an R learner. I am working on 'Human Activity Recognition' dataset from internet. It has 563 variables, the last variable being the class variable 'Activity' which has to be predicted. I am trying to use KNN algorithm here from CARET package of…
0
votes
1 answer

kNN in R using two parameters

Is it possible to train the data in relation to both the trainData$sp and trainData$sex ? library(dplyr) library(caret) library(e1071) data(crabs, package = "MASS") crabs = mutate_if(crabs, is.character, as.factor) set.seed(1234) index <-…
Lymmuar
  • 1
  • 1
0
votes
1 answer

how to return index of nearest neighbor in knngow

I want to use knngow in the dprep package. And, in addition to returning the appropriate label for the test data, I also want to return the row index to the nearest neighbor(in train data). Is there any function in this package for this job?My data…
maria
  • 45
  • 6
0
votes
1 answer

Error in KNN function to make predictions

I am trying to predict values for a categorical variable using a KNN model in R. To do this, I am using a function so that I can easily vary the dataset, % of observations, and k-value. When I apply this function to a particular dataset though, I…
zsad512
  • 861
  • 3
  • 15
  • 41
0
votes
1 answer

return the value of the nearest neighbor

I want to use the knn method for classification, but in addition to retrieving the appropriate label, I also need to retrieve the values of the nearest neighbor (to the test data). How can I retrieve the nearest neighbor in 1nn? For example, I…
maria
  • 45
  • 6
0
votes
1 answer

Error when training k-NN using SURF Features

I'm trying to retrieve a set of similar images based on an input image. I'm using setting an array element with a sequence. setting an array element with a sequence. OpenCV for Python by the way. My strategy is that I get the SURF features of the…
Jessie
  • 31
  • 5
0
votes
0 answers

Issue in kNN regression using R?

I am trying to do prediction using kNN regression in R. I have two variable (X,Y) in excel table format (total 800 data-sets in each variable). My aim is to predict the value of Y (present in Test table) So for that I have written code in R as…
Ankita
  • 485
  • 5
  • 18
0
votes
2 answers

What would be predicted class of KNN for an even value of K and in case of a tie?

In KNN (K nearest neighbour) classifiers, if an even value of K is chosen, then what would be the prediction in majority voting rule or in Euclidean distance rule. For example if there are 3 classes…
Mayukh Sarkar
  • 2,289
  • 1
  • 14
  • 38
0
votes
0 answers

Training set for alphabetic characters

I am training a kNN model that will recognize alphabetic characters in the range [A-Z0-9]. Most of the training sets that I found online only contain digits [0-9]. Can someone tell me where I can found such training set ([A-Z0-9])?
Patrick
  • 1,091
  • 1
  • 14
  • 14
0
votes
1 answer

How to find the best value of k For the k-NN?

I have 4 different datasets and Each dataset contains two-dimensional samples that belong to one of the two classes: 1 or 2. The class labels (1 or 2) for each sample are located in the last column. The first and second columns contain the…
0
votes
0 answers

KNN Classification Test Accuracy Explanation

I have run a K-Nearest Neighbor on some sleep data I am using for a project. I was able to run the model to about a 45% accuracy. What I am struggling to understand is that I graphed out the accuracy of my model based on the K number. After K > 36…
DEB
  • 241
  • 1
  • 2
  • 7
0
votes
1 answer

K nearest neighbors in Mathcad - what functions I can use?

I found the k nearest neighbours Mathcad implementation in M. Nixon's book but 'distance' and 'coord' functions don't work. How can I fix it? mathcad implementation
Anastasiia
  • 21
  • 4