I don't know much about Python, but for my class I have to build a KNN classifier where I need all of the neighbors (not just the class predicted by majority vote), along with the percentage for each neighbor class (its chance of occurring). For example, my training data looks like:
zip lat long
77339 73730.689 -990323.6834
77339 73731.699 -990323.6834
77345 71679.54137 -998244.7071
77346 71679.54137 -998245.7071
and my test data is:
77388 55410.72694 -994507.7348
77389 60816.86756 -990211.3075
77396 68641.10762 -997071.3902
77380 48762.26134 -978612.6912
where zip is the class to predict. I made two files, one for the training set and one for the test set. My goal is to get the neighbor list for each test instance. So, if the majority vote (k=3) predicts zip 77380 for the test instance whose true zip is 77388, I want to know the percentage for 77380 as well as for the other neighbors of that instance (the other 2 neighbors might be 77388 and 77389).
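To make the goal concrete, here is a minimal sketch of the numbers I am after for a single test instance. The helper name `neighbor_percentages` and the three neighbor labels are hypothetical, made up only for illustration; the idea is that each class's percentage is its share of the k neighbor votes.

```python
from collections import Counter

def neighbor_percentages(neighbor_labels):
    """Return {label: percentage of the k neighbors that carry that label}."""
    counts = Counter(neighbor_labels)
    k = len(neighbor_labels)
    return {label: 100.0 * n / k for label, n in counts.items()}

# Hypothetical labels of the 3 nearest training points for one test instance
shares = neighbor_percentages([77380, 77380, 77388])

# The majority-vote prediction is the label with the largest share
predicted = max(shares, key=shares.get)
print(predicted, shares)
```

With uniform weights, the predicted zip is simply the label with the largest share, and the remaining shares are the percentages for the other neighbors.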
I tried the following code, with help from this link: http://alexhwoods.com/k-nearest-neighbors-in-scikit-learn/
import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

train = pd.read_csv('C:\\train.csv')
test = pd.read_csv('C:\\test.csv')

cols = ['lat', 'long']   # numeric features (location)
cols2 = ['tweetzip']     # label column (the zip to predict)

# .values replaces the deprecated DataFrame.as_matrix()
trainArr = train[cols].values    # training features
trainRes = train[cols2].values   # correct zip for the training data
testArr = test[cols].values      # test features
testRes = test[cols2].values     # correct zip for the test data

knn = KNeighborsClassifier(n_neighbors=3, weights='uniform')
knn.fit(trainArr, trainRes.ravel())
output = knn.predict(testArr)
knn.kneighbors(testArr, return_distance=False)  # is this the same as the previous line?
tweezip = []
predictzip = []
correct = 0.0
for i in range(len(output)):
    # count a hit when the true zip matches the predicted zip
    if testRes[i][0] == output[i]:
        correct += 1
    tweezip.append(testRes[i][0])
    predictzip.append(output[i])
print(tweezip)
print(predictzip)
print('correct', correct / len(output))
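As far as I can tell from the scikit-learn documentation, `kneighbors` returns row indices into the training set rather than predicted labels, so it does not look like the same thing as `predict`. Below is a rough sketch of how the neighbor labels and their vote percentages might be pulled out. The arrays `X_train`, `y_train` and `X_test` are made-up toy data, not my real files, and I am assuming `predict_proba` with `weights='uniform'` gives each class's share of the k votes.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data, made up for illustration only (not the real train/test files)
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0]])
y_train = np.array([77339, 77339, 77345, 77346, 77346])
X_test = np.array([[0.1, 0.3], [10.0, 10.4]])

knn = KNeighborsClassifier(n_neighbors=3, weights='uniform')
knn.fit(X_train, y_train)

# kneighbors returns indices of the nearest training rows, not class labels
neighbor_idx = knn.kneighbors(X_test, return_distance=False)

# Look the indices up in y_train to get the label of every neighbor
neighbor_labels = y_train[neighbor_idx]  # shape: (n_test, k)

# predict_proba returns one column per class, ordered like knn.classes_
proba = knn.predict_proba(X_test)

for labels, row in zip(neighbor_labels, proba):
    # Keep only classes that actually appear among the k neighbors
    shares = {int(c): round(100 * p, 1) for c, p in zip(knn.classes_, row) if p > 0}
    print(labels.tolist(), shares)
```

If this is right, the same pattern should carry over to my code by using `trainArr`, `trainRes.ravel()` and `testArr` in place of the toy arrays.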
I searched but could not find an example that shows the right way to do this. Thank you.