I am trying to find my way around nltk-trainer (https://github.com/japerk/nltk-trainer). I managed to train a Dutch tagger and a Dutch chunker with the following commands (run directly in the Anaconda console):
python train_tagger.py conll2002 --fileids ned.train --classifier IIS --filename ~/nltk_data/taggers/conll2002_ned_IIS.pickle
python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename ~/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle
Then I ran a small script to test the tagger and the chunker:
import nltk
from nltk.corpus import conll2002
# Loading training pickles
tokenizer = nltk.data.load('tokenizers/punkt/dutch.pickle')
tagger = nltk.data.load('taggers/conll2002_ned_IIS.pickle')
chunker = nltk.data.load('chunkers/conll2002_ned_NaiveBayes.pickle')
# Testing
test_sents = conll2002.tagged_sents(fileids="ned.testb")[0:1000]
print "tagger accuracy on test-set: " + str(tagger.evaluate(test_sents))
test_sents = conll2002.chunked_sents(fileids="ned.testb")[0:1000]
print "chunker evaluation on test-set: " + str(chunker.evaluate(test_sents))
This works fine from the nltk-trainer-master folder, but when I move the script elsewhere, I get an import error:
ImportError: No module named nltk_trainer.chunking.chunkers
How can I make this work outside the nltk-trainer-master folder, without copying the nltk_trainer folder?
(Python 2.7, nltk 3.2.1)
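As far as I understand it, a pickle does not contain the class code itself, only a reference to the module the class lives in (here nltk_trainer.chunking.chunkers, going by the error message), so that module has to be importable wherever the pickle is loaded. Below is a minimal sketch that reproduces the effect without nltk. The names demo_chunkers and DemoChunker and the temporary folder are hypothetical stand-ins for nltk_trainer and the nltk-trainer-master folder, not part of any real library.

```python
import os
import pickle
import sys
import tempfile
import textwrap

# Hypothetical stand-in for the nltk-trainer-master folder.
pkg_root = tempfile.mkdtemp()
with open(os.path.join(pkg_root, "demo_chunkers.py"), "w") as f:
    f.write(textwrap.dedent("""
        class DemoChunker(object):
            def __init__(self, name):
                self.name = name
    """))

# "Training" step: the module is importable, so pickling works.
sys.path.insert(0, pkg_root)
import demo_chunkers
payload = pickle.dumps(demo_chunkers.DemoChunker("ned"))

# Simulate running the script from another folder:
# the module can no longer be found on sys.path.
sys.path.remove(pkg_root)
del sys.modules["demo_chunkers"]

try:
    pickle.loads(payload)
    failed = False
except ImportError:
    # Same kind of failure as "No module named nltk_trainer.chunking.chunkers".
    failed = True

# Possible workaround: make the folder importable again before loading.
sys.path.insert(0, pkg_root)
restored = pickle.loads(payload)
```

If the same reasoning applies to my case, the equivalent step would be to add the nltk-trainer-master folder to sys.path (or to the PYTHONPATH environment variable) before calling nltk.data.load, but I have not confirmed that this is the recommended approach.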