I'm running a Python script that I took from a website. It's a simple script that loads the Iris dataset and performs KNN classification on it. But when I run it, every evaluation metric comes out as 1.0, which I believe is a wrong result. Where did I make a mistake?

Classification part of the script:

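from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

# X_train, X_test, y_train, y_test come from the train/test split in the full script
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Predict on the held-out test set and report the metrics
y_pred = knn.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))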

You can find the full script here

user1234
  • Could you add the full code? – Minions Jun 06 '18 at 02:44
  • @Minion Sure, I added it to the question. – user1234 Jun 06 '18 at 02:52
  • I think there is no problem. See this question: https://stackoverflow.com/questions/36967126/why-do-i-get-good-accuracy-with-iris-dataset-with-a-single-hidden-node – Minions Jun 06 '18 at 03:01
  • @Minion, just asking to understand: even when I change the train and test samples, I always get 1.0 for everything. Shouldn't that be a wrong result? In the question you linked, he gets 98% accuracy, which is an understandable result. I just find 1.0 very odd. – user1234 Jun 06 '18 at 03:06
  • In the linked question he used a neural network for the classification. I tried changing the number of neighbors in your example to something high, and the result changed to 93. – Minions Jun 06 '18 at 03:09
  • Also, the split that occurred in your case just happens (by chance) to give that score. Try changing `random_state` to some other value and you will see the scores change. – Vivek Kumar Jun 06 '18 at 04:50

1 Answer

Try another dataset. Your code is probably fine: the classifier may simply be good enough to classify all of the test data correctly. I ran this kNN program and saw the same thing, but on another dataset it gives scores other than 1.
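One way to check whether the perfect score is just a lucky split is to repeat the split with several `random_state` values and compare against cross-validation. The following is a minimal sketch, not the script from the question: the helper `accuracy_for_seed`, the 20% test size, the stratified split, and the range of seeds are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)


def accuracy_for_seed(seed, n_neighbors=5, test_size=0.2):
    """Hypothetical helper: fit KNN on one random split and return test accuracy."""
    # test_size and stratify are assumed here; the original script may differ.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    knn = KNeighborsClassifier(n_neighbors=n_neighbors)
    knn.fit(X_train, y_train)
    return knn.score(X_test, y_test)


# Accuracy on 20 different random splits; a single split can land on 1.0 by chance.
scores = [accuracy_for_seed(seed) for seed in range(20)]
print("min:", min(scores), "max:", max(scores), "mean:", sum(scores) / len(scores))

# 10-fold cross-validation averages over folds and is less sensitive to one split.
cv_scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=10)
print("10-fold CV mean accuracy:", cv_scores.mean())
```

If the scores vary between splits, then a 1.0 on one particular split says more about that split than about a bug in the code. Iris is small and its classes are easy to separate, so high scores are expected.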