I am just playing around with part-of-speech tagging and have started using OpenNLP.
I am using the following code to load the model (Java):
m_modelFile = new FileInputStream("c:\\DATA\\en-parser-chunking.bin");
m_model = new ParserModel(m_modelFile);   // this constructor is the slow part
m_parser = ParserFactory.create(m_model);
...
Parse[] topParses = ParserTool.parseLine(sentence, m_parser, 1);
I am noticing that the call to create the ParserModel object is extremely slow, possibly because en-parser-chunking.bin is about 35 MB. Is there a better way to use this so that it isn't this slow? Alternatively, is there a POS tagger you would recommend, or a way of calling the API that is faster?
I've been playing around with the accuracy, and it's pretty good, but I am not happy with the performance when loading the model...
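To clarify what I mean by a POS tagger: since I only need part-of-speech tags, I was wondering whether OpenNLP's dedicated POS tagger API would be enough. A rough sketch of what I had in mind is below (this assumes the standard en-pos-maxent.bin model, which I believe is much smaller than the parser model, and the simple whitespace tokenizer; the model path is just from my local setup and I haven't benchmarked it):

import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.WhitespaceTokenizer;

// Load the (smaller) POS model once at startup and keep the tagger around.
InputStream modelIn = new FileInputStream("c:\\DATA\\en-pos-maxent.bin");
POSModel posModel = new POSModel(modelIn);
POSTaggerME tagger = new POSTaggerME(posModel);

// Tokenize the sentence and tag each token.
String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(sentence);
String[] tags = tagger.tag(tokens);

Would that be the recommended way, or is there something faster still?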
Thanks guys.