I wrote a simple Naive Bayes word classifier. In simple terms, what it does is:
...
train( "some text A ...", "categoryA" );
train( "some text A ...", "categoryA" );
train( "some text B ...", "categoryB" );
train( "some text B ...", "categoryB" );
...
myclass category = GetCategory( "some new text" );
EXPECT_EQ( "categoryA"|"categoryB", category.Id);
EXPECT_EQ( xyz%, category.Percent);
...
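To make the snippet above concrete, here is a minimal, self-contained sketch of the same setup. Everything in it is an assumption made for illustration, not the actual classifier: the `NaiveBayes` class, the `Category` struct, the whitespace tokenizer, the add-one (Laplace) smoothing, and the `MakeTrainedExample` helper with its made-up training texts are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical result type, mirroring category.Id / category.Percent above.
struct Category {
    std::string Id;
    double Percent;  // posterior probability of the winning category, 0..100
};

// Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing.
class NaiveBayes {
public:
    void Train(const std::string& text, const std::string& category) {
        ++docCount_[category];
        ++totalDocs_;
        auto& counts = wordCount_[category];  // created even for empty text
        auto& total = totalWords_[category];
        std::istringstream in(text);
        std::string word;
        while (in >> word) {  // assumption: words are whitespace-separated
            ++counts[word];
            ++total;
            vocabulary_.insert(word);
        }
    }

    Category GetCategory(const std::string& text) const {
        assert(totalDocs_ > 0 && "Train() must be called before GetCategory()");
        const double vocab = static_cast<double>(vocabulary_.size());
        std::map<std::string, double> logScore;
        for (const auto& entry : docCount_) {
            const std::string& category = entry.first;
            // log prior + sum of smoothed log likelihoods
            double score = std::log(static_cast<double>(entry.second) / totalDocs_);
            const auto& counts = wordCount_.at(category);
            const double total = static_cast<double>(totalWords_.at(category));
            std::istringstream in(text);
            std::string word;
            while (in >> word) {
                const auto it = counts.find(word);
                const double count = (it == counts.end()) ? 0.0 : it->second;
                score += std::log((count + 1.0) / (total + vocab));
            }
            logScore[category] = score;
        }
        // Pick the best score, then normalise relative to it so that
        // exponentiating the log scores cannot underflow.
        auto best = logScore.begin();
        for (auto it = logScore.begin(); it != logScore.end(); ++it) {
            if (it->second > best->second) best = it;
        }
        double sum = 0.0;
        for (const auto& entry : logScore) {
            sum += std::exp(entry.second - best->second);
        }
        return Category{best->first, 100.0 / sum};
    }

private:
    std::map<std::string, int> docCount_;
    std::map<std::string, std::map<std::string, int>> wordCount_;
    std::map<std::string, int> totalWords_;
    std::set<std::string> vocabulary_;
    int totalDocs_ = 0;
};

// Hypothetical training set: two clearly separable categories.
NaiveBayes MakeTrainedExample() {
    NaiveBayes nb;
    nb.Train("goal match team score", "sports");
    nb.Train("team player goal league", "sports");
    nb.Train("election vote party minister", "politics");
    nb.Train("vote parliament party law", "politics");
    return nb;
}
```

With this training set, `MakeTrainedExample().GetCategory("team goal")` should return `sports`. Under the smoothing assumed here, its `Percent` works out to 90 (a 9:1 likelihood ratio with equal priors), so a test can pin the `Id` exactly and the percentage within a small floating-point tolerance.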
While this works in practice, I was wondering whether there is a better way to unit test the classification of a document.
Would using three, four, or more categories make the test more reliable?
What would be a good suite of tests for my function?