How to use the Stanford NLP Chinese segmenter in Java?

Question

I tried the following code, but it does not work: it only prints null.

String text = "我爱北京天安门。";
StanfordCoreNLP pipeline = new StanfordCoreNLP();
Annotation annotation = pipeline.process(text);
String result = annotation.get(CoreAnnotations.ChineseSegAnnotation.class);
System.out.println(result);

The result:

...
done [0.6 sec].
Using mention detector type: rule
null

How do I use the Stanford NLP Chinese segmenter correctly?

StanfordNLPHelp · Answer 1 · 2016-07-19T12:16:44.060

Some sample code:

import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.StringUtils;

import java.util.*;

public class ChineseSegmenter {

    public static void main (String[] args) {
        // set the properties to the standard Chinese pipeline properties
        Properties props = StringUtils.argsToProperties("-props", "StanfordCoreNLP-chinese.properties");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
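        String text = "...";
        // wrap the text in an Annotation and run the pipeline's annotators on it
        Annotation annotation = new Annotation(text);
        pipeline.annotate(annotation);
        // the segmented words are stored as CoreLabels in the document's token list
        List<CoreLabel> tokens = annotation.get(CoreAnnotations.TokensAnnotation.class);
        for (CoreLabel token : tokens) {
            System.out.println(token);
        }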
    }
}

Note: Make sure the Chinese models jar is on your CLASSPATH. That file is available here: http://stanfordnlp.github.io/CoreNLP/download.html

The above code should print the tokens produced by the Chinese segmenter, one per line.
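
If the pipeline fails to start, it may be worth confirming the classpath note first. The following is only a rough sketch using plain JDK calls: it checks whether StanfordCoreNLP-chinese.properties (the file name used in the sample) is visible on the classpath. It assumes that file is packaged inside the Chinese models jar, and the names ClasspathCheck and isOnClasspath are made up for this illustration.

```java
public class ClasspathCheck {

    // Returns true if the named resource is visible to the system class loader.
    static boolean isOnClasspath(String resourceName) {
        return ClassLoader.getSystemResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // File name taken from the sample code; assumed to ship in the Chinese models jar.
        String name = "StanfordCoreNLP-chinese.properties";
        if (isOnClasspath(name)) {
            System.out.println(name + " found on the classpath");
        } else {
            System.out.println(name + " not found; check that the Chinese models jar is on the classpath");
        }
    }
}
```

Run it with the same classpath you use for ChineseSegmenter; if it reports that the file was not found, the models jar is probably missing from that classpath.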
