Thanks to this great answer I got a good start training my own NE chunker for Dutch, using NLTK and the Conll2002 corpus: NLTK named entity recognition in dutch. Using those hints I was also able to train an improved tagger (based on IIS classification) that reaches around 95% accuracy, which is enough for my purposes.
However, the F-measure of the named entity recognition is only around 40%. How can I improve this? I tried the built-in algorithms like Maxent, but I only get a memory error. I then tried to get Megam working, but it won't compile on my Windows machine and there is no binary available anymore. I also ran into dead ends trying to incorporate other software such as libSVM, YamCha, CRF++ and Weka: each has its own manual and its own problems, and they keep stacking up, so I'm feeling a bit overwhelmed.
What I need is a practical approach to NER for Dutch. There has been a lot of research, and I found papers quoting F-measures between 70% and 85%, which would be great. Does anyone have a hint as to where I could find an improved implementation, or how I could build one myself (on Windows)? I would prefer NLTK for its flexibility, but if there is a standard solution in a different toolkit I'm game for that too. Even commercial tools would be welcome.
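For context, my understanding is that the higher-scoring systems in those papers mostly feed their classifier (CRF, Maxent, etc.) much richer per-token features than a default NLTK chunker uses. A minimal sketch of such a feature extractor (the function name `word2features` and all feature names are my own illustration, not taken from any specific toolkit):

```python
def word2features(sent, i):
    """Extract classifier features for the token at position i.

    `sent` is a list of (word, pos) pairs. The feature set below is a
    typical starting point for CoNLL-style NER, not a definitive recipe.
    """
    word, pos = sent[i]
    features = {
        'word.lower': word.lower(),
        'word.istitle': word.istitle(),   # capitalisation is a strong NE cue
        'word.isupper': word.isupper(),
        'word.isdigit': word.isdigit(),
        'suffix3': word[-3:],             # crude morphological signal
        'pos': pos,
    }
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        features['prev.lower'] = prev_word.lower()
        features['prev.pos'] = prev_pos
    else:
        features['BOS'] = True            # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        features['next.lower'] = next_word.lower()
        features['next.pos'] = next_pos
    else:
        features['EOS'] = True            # end of sentence
    return features

sent = [('Jan', 'N'), ('woont', 'V'), ('in', 'Prep'), ('Amsterdam', 'N')]
print(word2features(sent, 0)['word.istitle'])  # True
print(word2features(sent, 3)['prev.lower'])    # in
```

Feature dicts like these can be plugged into NLTK's classifier-based chunking, or exported to an external tool's feature format, which is roughly what the CRF-based papers seem to do.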
Here is the code I use for the evaluation now:
import nltk
from nltk.corpus import conll2002
tokenizer = nltk.data.load('tokenizers/punkt/dutch.pickle')  # Dutch sentence tokenizer (not used in the evaluation below)
tagger = nltk.data.load('taggers/conll2002_ned_IIS.pickle')  # IIS POS tagger trained on ned.train
chunker = nltk.data.load('chunkers/conll2002_ned_NaiveBayes.pickle')  # NaiveBayes NE chunker
test_sents = conll2002.tagged_sents(fileids="ned.testb")[0:1000]
print "tagger accuracy on test-set: " + str(tagger.evaluate(test_sents))
test_sents = conll2002.chunked_sents(fileids="ned.testb")[0:1000]
print chunker.evaluate(test_sents)
# chunker trained with the following nltk-trainer command line:
# python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename /nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle