Questions tagged [knn]

In pattern recognition, k-nearest neighbors (k-NN) is a classification algorithm used to classify an example based on a set of already classified examples. Algorithm: a case is classified by a majority vote of its neighbors, being assigned to the class most common among its k nearest neighbors as measured by a distance function. If k = 1, the case is simply assigned to the class of its single nearest neighbor.

The idea of the k-nearest neighbors (k-NN) algorithm is to use the features of an example, which are known, to determine its classification, which is unknown.

First, a set of classified samples is supplied to the algorithm as a training set. When a new, unclassified sample is given, the algorithm finds the k nearest neighbors of the new sample and determines its classification from the classes of those neighbors.

The algorithm is sometimes called a lazy classifier because it does nothing during "learning" beyond storing the samples; all the work is deferred to classification time.

Algorithm

The k-NN algorithm is among the simplest of all machine learning algorithms. A shortcoming of the k-NN algorithm is that it is sensitive to the local structure of the data.

The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples.

In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label which is most frequent among the k training samples nearest to that query point.

A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as for text classification, another metric can be used, such as the overlap metric (or Hamming distance).
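As a concrete illustration of the two phases described above (training as storage, classification as a majority vote under a distance metric), here is a minimal brute-force sketch in Python; the function name and toy data are illustrative, and NumPy is assumed:

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_query, k=3):
        # "Training" is nothing more than keeping X_train / y_train in memory.
        dists = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distance to every sample
        nearest = np.argsort(dists)[:k]                    # indices of the k closest samples
        return Counter(y_train[nearest]).most_common(1)[0][0]  # majority vote

    X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [4.9, 5.0]])
    y = np.array([0, 0, 1, 1])
    print(knn_predict(X, y, np.array([4.8, 5.2]), k=3))  # -> 1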

Often, the classification accuracy of k-NN can be improved significantly if the distance metric is learned with a specialized algorithm such as Large Margin Nearest Neighbor or Neighbourhood Components Analysis.
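For example, scikit-learn (version 0.21 and later) ships Neighbourhood Components Analysis, which can be chained in front of k-NN; a hedged sketch on a standard dataset:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
    from sklearn.pipeline import Pipeline

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # NCA learns a linear transformation of the feature space; k-NN then uses
    # ordinary Euclidean distance in the transformed space.
    model = Pipeline([
        ("nca", NeighborhoodComponentsAnalysis(random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ])
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))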


1775 questions
10 votes, 3 answers

How to use both binary and continuous features in the k-Nearest-Neighbor algorithm?

My feature vector has both continuous (or widely ranging) and binary components. If I simply use Euclidean distance, the continuous components will have a much greater impact: Representing symmetric vs. asymmetric as 0 and 1 and some less important…
John Hall • 103 • 4
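One possible approach to this question (an illustrative sketch with a hypothetical helper, not the accepted answer): standardize the continuous columns so they cannot dominate the Euclidean distance, and give each feature an explicit weight:

    import numpy as np

    def weighted_distance(a, b, weights):
        # weighted Euclidean distance; larger weight = more influence
        return np.sqrt(np.sum(weights * (a - b) ** 2))

    # column 0: continuous (e.g. a length in cm); column 1: binary flag
    X = np.array([[170.0, 1.0],
                  [165.0, 0.0],
                  [180.0, 1.0]])

    # z-score the continuous column so it no longer dwarfs the 0/1 column
    X[:, 0] = (X[:, 0] - X[:, 0].mean()) / X[:, 0].std()

    w = np.array([1.0, 0.5])  # hypothetical per-feature importances
    print(weighted_distance(X[0], X[1], w))
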
10 votes, 1 answer

Predict with sklearn-KNN using median (instead of mean)

Sklearn-KNN allows one to set weights (e.g., uniform, distance) when calculating the mean of the x nearest neighbours. Instead of predicting with the mean, is it possible to predict with the median (perhaps with a user-defined function)?
Eugene Yan • 841 • 2 • 9 • 23
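One workaround, sketched below (assuming a manual computation is acceptable): ask the fitted estimator for the neighbor indices via kneighbors() and take the median of their targets yourself:

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
    y = np.array([1.0, 2.0, 100.0, 4.0, 5.0])  # note the outlier at x = 2

    knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)
    idx = knn.kneighbors(np.array([[2.0]]), return_distance=False)
    print(np.median(y[idx[0]]))  # median of the 3 nearest targets: robust to the outlier
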
10 votes, 3 answers

How to convert distance into probability?

Can anyone shed some light on my Matlab program? I have data from two sensors and I'm doing a kNN classification for each of them separately. In both cases the training set looks like a set of vectors of 42 rows total, like this: [44 12 53 29 35 30 49; …
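A common heuristic for this (an assumption about intent, not the asker's required method) is a softmax over negative distances, so smaller distances map to larger probabilities that sum to 1; a NumPy sketch with a hypothetical helper:

    import numpy as np

    def distances_to_probabilities(d):
        # softmax over negative distances: closest -> highest probability
        s = np.exp(-np.asarray(d, dtype=float))
        return s / s.sum()

    d = [0.4, 1.5, 3.0]  # e.g. distance to the nearest neighbour of each class
    print(distances_to_probabilities(d))  # sums to 1, largest value for d = 0.4
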
10 votes, 3 answers

How to measure the accuracy of a kNN classifier in Python

I have used kNN to classify my dataset. But I do not know how to measure the accuracy of the trained classifier. Does scikit-learn have any built-in function to check the accuracy of a kNN classifier? from sklearn.neighbors import KNeighborsClassifier knn =…
user1946217 • 1,733 • 6 • 31 • 40
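scikit-learn does provide this: the classifier's score() method returns mean accuracy, and sklearn.metrics.accuracy_score gives the same number; a short sketch on a standard dataset:

    from sklearn.datasets import load_iris
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    print(knn.score(X_test, y_test))                    # mean accuracy on held-out data
    print(accuracy_score(y_test, knn.predict(X_test)))  # the same number, computed explicitly
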
10 votes, 2 answers

kNN: training, testing, and validation

I am extracting image features from 10 classes with 1000 images each. Since there are 50 features that I can extract, I am thinking of finding the best feature combination to use here. Training, validation and test sets are divided as…
klijo • 15,761 • 8 • 34 • 49
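A minimal sketch of one common division (the 60/20/20 ratio and random data are assumptions): two calls to train_test_split produce train, validation, and test sets:

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.random.rand(1000, 50)        # stand-in for 50 image features
    y = np.random.randint(0, 10, 1000)  # stand-in for 10 classes

    # 60% train, then split the remaining 40% in half: 20% validation, 20% test
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
    print(len(X_train), len(X_val), len(X_test))  # 600 200 200
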
9 votes, 3 answers

Faster kNN Classification Algorithm in Python

I want to code my own kNN algorithm from scratch, because I need to weight the features. The problem is that my program is still really slow despite removing for loops and using built-in NumPy functionality. Can anyone suggest a way to…
Eoin Ó Coinnigh • 307 • 1 • 3 • 7
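The usual speed-up for a from-scratch NumPy k-NN is to compute all pairwise squared distances at once via the expansion ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b, and to use argpartition instead of a full sort; a hedged sketch (the helper name and sizes are illustrative, and feature weighting is folded in by rescaling columns):

    import numpy as np

    def knn_indices(X_train, X_query, k, feature_weights=None):
        if feature_weights is not None:
            w = np.sqrt(feature_weights)          # weighted Euclidean = rescale columns
            X_train, X_query = X_train * w, X_query * w
        # squared distances for every (query, train) pair in one shot
        d2 = ((X_query ** 2).sum(1)[:, None]
              + (X_train ** 2).sum(1)[None, :]
              - 2.0 * X_query @ X_train.T)
        return np.argpartition(d2, k, axis=1)[:, :k]  # k nearest per query, unordered

    X = np.random.rand(10000, 20)
    Q = np.random.rand(100, 20)
    print(knn_indices(X, Q, k=5, feature_weights=np.ones(20)).shape)  # (100, 5)
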
9 votes, 1 answer

Tuning leaf_size to decrease time consumption in Scikit-Learn KNN

I was trying to implement KNN for handwritten character recognition when I found that the code was taking a lot of time to execute. When I added the parameter leaf_size with value 400, I observed that the time taken by the code to execute was significantly…
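leaf_size is indeed a real KNeighborsClassifier parameter; it sets the point at which the tree falls back to brute force at the leaves, trading construction time against query time. A timing sketch (the dataset and values are illustrative):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X = np.random.rand(20000, 64)          # stand-in for handwritten-character features
    y = np.random.randint(0, 10, 20000)

    for leaf in (30, 400):                 # 30 is scikit-learn's default
        knn = KNeighborsClassifier(n_neighbors=3, algorithm="kd_tree", leaf_size=leaf)
        knn.fit(X, y)
        knn.predict(X[:100])               # time this call to compare the two settings
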
9 votes, 2 answers

Grid Search parameter and cross-validated data set in KNN classifier in Scikit-learn

I'm trying to build my first KNN classifier using scikit-learn. I've been following the User Guide and other online examples, but there are a few things I am unsure about. For this post, let's use the following: X = data, Y = target. In most…
browser • 313 • 1 • 3 • 12
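A minimal sketch of the usual pattern (the grid values are illustrative): GridSearchCV performs the cross-validation internally on the training portion, so only a final test set needs to be held out:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    param_grid = {"n_neighbors": list(range(1, 31)), "weights": ["uniform", "distance"]}
    search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
    search.fit(X_train, y_train)                     # cross-validation happens in here
    print(search.best_params_, search.score(X_test, y_test))
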
9 votes, 4 answers

Is there any function to calculate Precision and Recall using Matlab?

I have a problem calculating precision and recall for a classifier in Matlab. I use the fisherIris data (which consists of 150 data points: 50 setosa, 50 versicolor, 50 virginica). I have classified them using the kNN algorithm. Here is my confusion…
user19565 • 155 • 1 • 2 • 9
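The question asks about Matlab, but the arithmetic is language-independent; a NumPy sketch with an illustrative confusion matrix, assuming rows are true classes and columns are predictions:

    import numpy as np

    C = np.array([[50,  0,  0],   # illustrative 3-class confusion matrix
                  [ 0, 47,  3],
                  [ 0,  4, 46]])

    tp = np.diag(C).astype(float)
    precision = tp / C.sum(axis=0)  # true positives / everything predicted as that class
    recall = tp / C.sum(axis=1)     # true positives / everything actually in that class
    print(precision, recall)
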
9 votes, 2 answers

Is a k-d tree efficient for kNN search (k nearest neighbors search)?

I have to implement k nearest neighbors search for 10-dimensional data in a k-d tree. The problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2,5,10,20,100). Is this normal for k-d trees, or am I doing something…
Andraz • 709 • 2 • 7 • 16
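As a baseline to compare a hand-rolled k-d tree against, SciPy's cKDTree supports k > 1 directly; a sketch with 10-dimensional data (the sizes are illustrative):

    import numpy as np
    from scipy.spatial import cKDTree

    X = np.random.rand(100000, 10)         # 10-dimensional data, as in the question
    tree = cKDTree(X)

    queries = np.random.rand(5, 10)
    dist, idx = tree.query(queries, k=20)  # 20 nearest neighbours per query point
    print(idx.shape)                       # (5, 20)
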
9 votes, 3 answers

K-nearest neighbour C/C++ implementation

Where can I find a serial C/C++ implementation of the k-nearest neighbour algorithm? Do you know of any library that has this? I have found OpenCV, but its implementation is already parallel. I want to start from a serial implementation and…
alexsardan • 253 • 1 • 4 • 7
8 votes, 1 answer

Problems with k-NN regression in R

I am trying to run knnreg from the package caret. For some reason, this training set works: > summary(train1) V1 V2 V3 13 : 10474 1 : 6435 7 : 8929 10 : 10315 2 : …
thecheech • 2,041 • 3 • 18 • 25
8 votes, 1 answer

Why is KNN much faster than decision tree?

Once, in an interview, I encountered a question from the employer. He asked me why a KNN classifier is much faster than a decision tree, for example in letter recognition or face recognition. I had no idea at the time. So I want to know…
zfz • 1,597 • 1 • 22 • 45
7 votes, 7 answers

K Nearest Neighbour Algorithm doubt

I am new to Artificial Intelligence. I understand the k nearest neighbour algorithm and how to implement it. However, how do you calculate the distance or weight for things that aren't on a scale? For example, the distance for age can be easily calculated,…
wai • 8,923 • 4 • 24 • 19
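One common convention (an assumption, not the only possible answer): treat a categorical mismatch as distance 1 and rescale numeric attributes to [0, 1] so the two kinds of feature are comparable; a tiny sketch with a hypothetical helper:

    def mixed_distance(a, b, age_range):
        # a, b: (age, colour) pairs; age_range: max - min of age in the data
        d_age = abs(a[0] - b[0]) / age_range      # numeric, rescaled to [0, 1]
        d_colour = 0.0 if a[1] == b[1] else 1.0   # categorical: match or mismatch
        return d_age + d_colour

    print(mixed_distance((25, "red"), (30, "blue"), age_range=60))
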
7 votes, 1 answer

Euclidean distance, different results between Scipy, pure Python, and Java

I was playing around with different implementations of the Euclidean distance metric and I noticed that I get different results for Scipy, pure Python, and Java. Here's how I compute the distance using Scipy (= option 1): distance =…
Silas Berger • 129 • 6
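A sketch for reproducing such discrepancies: the formula is identical everywhere, so differences typically trace back to floating-point precision (e.g. float32 somewhere in one pipeline), though that is a guess, not a diagnosis of this asker's case:

    import math
    import numpy as np
    from scipy.spatial import distance

    a = np.array([0.1, 0.2, 0.3])
    b = np.array([0.4, 0.5, 0.6])

    print(distance.euclidean(a, b))                             # SciPy
    print(math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))   # pure Python
    print(float(np.linalg.norm(a.astype(np.float32) - b.astype(np.float32))))  # float32 drifts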