
I am using sklearn.lda for classification and am a little puzzled by the score function, which returns the mean classification accuracy. Is it determined by leave-one-out (jackknife)? How do I interpret the result? It is only a float value, without much documentation.

Thanks in advance, EL

Fred Foo
El Dude

1 Answer

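The score method takes samples X and their true labels y, predicts a label for each sample, and compares those predictions with y. It returns the mean accuracy (the fraction of samples classified correctly) as a single float between 0 and 1. No leave-one-out or other resampling is involved; the result describes only the X and y you pass in. For example,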

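from sklearn.lda import LDA  # moved to sklearn.discriminant_analysis.LinearDiscriminantAnalysis in 0.17
lda = LDA().fit(X, y)
print(lda.score(X, y))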

will print the accuracy of the classifier on its own training set.
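To make the distinction concrete, here is a minimal sketch. It assumes a recent scikit-learn (where the class lives in `sklearn.discriminant_analysis`) and uses the bundled iris data purely as a stand-in for your own X and y. It checks that `score` is just the fraction of correct predictions, and shows that a leave-one-out estimate has to be requested separately.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score, train_test_split

# Illustrative data only; substitute your own X and y.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

# Accuracy on the data the model was fitted on (usually optimistic).
train_acc = lda.score(X_train, y_train)

# Accuracy on held-out data.
test_acc = lda.score(X_test, y_test)

# score() is nothing more than the fraction of correct predictions.
manual_test_acc = np.mean(lda.predict(X_test) == y_test)

# A leave-one-out estimate is a separate, explicit step:
# one fit per sample, each scored on the single left-out sample.
loo_scores = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=LeaveOneOut())
loo_acc = loo_scores.mean()

print(train_acc, test_acc, manual_test_acc, loo_acc)
```

Each leave-one-out score is either 0.0 or 1.0 (one sample, right or wrong), so only their mean is meaningful.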

Every classifier has a score method, which usually (though not necessarily) returns mean accuracy. The method is used by the GridSearchCV model selection algorithm to determine the quality of the classifier if you don't explicitly give it a scoring argument.
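As a rough illustration of that default, the sketch below runs the same grid search twice: once without `scoring` (so the estimator's own `score` is used) and once with `scoring="accuracy"`. The iris data, the `lsqr` solver, and the shrinkage grid are arbitrary choices made for the example, not recommendations.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV

# Illustrative data and grid only.
X, y = load_iris(return_X_y=True)
param_grid = {"shrinkage": [None, "auto", 0.1, 0.5]}

# No scoring argument: GridSearchCV falls back to estimator.score,
# which for LDA is mean accuracy.
default_search = GridSearchCV(
    LinearDiscriminantAnalysis(solver="lsqr"), param_grid, cv=5
).fit(X, y)

# Explicit scoring argument requesting the same metric.
explicit_search = GridSearchCV(
    LinearDiscriminantAnalysis(solver="lsqr"), param_grid, cv=5, scoring="accuracy"
).fit(X, y)

print(default_search.best_params_, default_search.best_score_)
print(explicit_search.best_params_, explicit_search.best_score_)
```

Passing a different `scoring` string (for example `"f1_macro"`) would change what the search optimizes without changing what `lda.score` itself returns.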

Fred Foo
  • Thanks for the swift answer! My question now is: how do I interpret the output value? I just got 0.8 on a test set. Thanks for the class name anyway; it helps me understand scikit a little better. – El Dude Jun 21 '13 at 21:05
  • @ElDude: that means you have 80% accuracy on your test set. Whether that's good depends on the problem. – Fred Foo Jun 21 '13 at 21:15
  • Cool, thanks. One last question: which method provides the parameters of the discriminant function? get_params gives me this: {'priors': None, 'n_components': None} – El Dude Jun 21 '13 at 21:55
  • @ElDude: there's no such method; scikit-learn stores fitted model parameters as public attributes, while get_params only returns the constructor settings. Check the docstring, `help(LDA)`, under Attributes. – Fred Foo Jun 21 '13 at 22:27
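
To illustrate the last comment, here is a small sketch of where the fitted quantities end up. It again assumes a recent scikit-learn and uses iris as placeholder data. The attribute names (`coef_`, `intercept_`, `means_`, `priors_`, `classes_`) are the ones listed under Attributes in the `LinearDiscriminantAnalysis` docstring; older releases may expose a slightly different set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative data only: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis().fit(X, y)

# get_params() returns constructor settings (hyperparameters), not fitted values.
params = lda.get_params()
print(params)

# Fitted quantities are public attributes with a trailing underscore.
print(lda.classes_)    # class labels seen during fit
print(lda.priors_)     # estimated class priors
print(lda.means_)      # per-class feature means
print(lda.coef_)       # weights of the linear discriminant functions
print(lda.intercept_)  # corresponding intercepts

# The discriminant scores are a linear function of the inputs.
manual_scores = X @ lda.coef_.T + lda.intercept_
library_scores = lda.decision_function(X)
```

The trailing underscore is scikit-learn's convention for anything learned during `fit`, which is why these values never show up in `get_params`.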