
I have about 4 million texts to annotate with the Stanford POS tagger. How can I disable these logging messages:

Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [1,1 sec].

I don't need 4 million of these in my log files.

Pete

2 Answers


Stanford CoreNLP uses Redwood as its logging framework. You have to disable it before initializing the StanfordCoreNLP pipeline.

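import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.logging.RedwoodConfiguration;

// Clear the Redwood configuration before the pipeline loads its models.
RedwoodConfiguration.current().clear().apply();
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);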

It works for me. With this in place, the lengthy INFO messages no longer appear while the program runs.

Reference: RedwoodConfiguration Tutorial.

Hope it helps!

Om Prakash

Could you provide more details on how you're using Stanford CoreNLP? It looks like you're loading the POS tagger for each document, which you don't have to do. Instead, you could load the POS tagger once (once per worker if you have a cluster) and then go through the documents, reusing the already loaded tagger. This avoids the repeated log message and will also speed up your processing!
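As a rough illustration of the load-once pattern, here is a minimal, self-contained sketch. It deliberately does not call CoreNLP: `TaggerReuse`, `loadTagger`, `tagAll`, and the dummy `_X` tags are hypothetical stand-ins, and `loadCount` exists only to show that the expensive load happens once. In real code, the body of `loadTagger` would be where the tagger is constructed (for example with `MaxentTagger`), assuming your setup matches the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TaggerReuse {
    // Counts how often the (expensive) model load runs; for illustration only.
    static int loadCount = 0;

    // Hypothetical stand-in for loading a model. With CoreNLP, this is where
    // the tagger would be constructed once, e.g. new MaxentTagger(modelPath),
    // and the returned function would call tagger.tagString(text).
    static Function<String, String> loadTagger() {
        loadCount++;
        // Dummy tagging: append "_X" to every whitespace-separated token.
        return text -> {
            StringBuilder sb = new StringBuilder();
            for (String token : text.trim().split("\\s+")) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(token).append("_X");
            }
            return sb.toString();
        };
    }

    // Load once, then reuse the same tagger for every document.
    static List<String> tagAll(List<String> documents) {
        Function<String, String> tagger = loadTagger();
        List<String> tagged = new ArrayList<>();
        for (String doc : documents) {
            tagged.add(tagger.apply(doc));
        }
        return tagged;
    }

    public static void main(String[] args) {
        List<String> out = tagAll(List.of("This is a test", "Another document"));
        out.forEach(System.out::println);
        System.out.println("model loads: " + loadCount);
    }
}
```

The point is only the structure: the load sits outside the loop over documents, so the "Reading POS tagger model ..." message would be logged once per process rather than once per text.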

StanfordNLPHelp