I have a problem understanding how KFold cross-validation works in the new `model_selection` module. I am using a Naive Bayes classifier and I would like to test it with cross-validation. My test and train data are split like this:

test_set = posRevBag[:200] + negRevBag[:200] 
train_set = posRevBag[200:] + negRevBag[200:]

and they are represented like ({'one': True, 'last': True...},pos).

I know that in the old cross-validation I would have had something like:

cv = cross_validation.KFold(len(train_set), n_folds=10, shuffle=False, random_state=None)

for traincv, testcv in cv:
    classifier = nltk.NaiveBayesClassifier.train(train_set[traincv[0]:traincv[-1]])
    print 'accuracy:', nltk.classify.util.accuracy(classifier, train_set[testcv[0]:testcv[-1]])

For the new cross-validation, I saw that it no longer takes the length of the training set, and it also uses a `split()` method which I'm not quite familiar with, since I split my test and train sets manually as shown above.
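For reference, here is a minimal sketch of what I understand the new API to look like (assuming scikit-learn >= 0.18, and using a small toy list in place of my real `train_set` of `(feature_dict, label)` pairs):

```python
from sklearn.model_selection import KFold

# Toy stand-in for the real train_set of ({'word': True, ...}, label) pairs
train_set = [({'one': True}, 'pos'), ({'two': True}, 'neg'),
             ({'three': True}, 'pos'), ({'four': True}, 'neg'),
             ({'five': True}, 'pos'), ({'six': True}, 'neg')]

# The constructor no longer takes the dataset length, only the fold settings
kf = KFold(n_splits=3, shuffle=False)

# split() takes the data itself and yields (train_indices, test_indices) arrays
for train_idx, test_idx in kf.split(train_set):
    fold_train = [train_set[i] for i in train_idx]
    fold_test = [train_set[i] for i in test_idx]
    # classifier = nltk.NaiveBayesClassifier.train(fold_train)
    # print 'accuracy:', nltk.classify.util.accuracy(classifier, fold_test)
```

Is this the intended way to use `split()` with a manually built list like mine?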

Harshil
