I use the Stanford parser for my implementation. I would like to use the tree of a sentence in order to extract various information.

I started from the code in "Get certain nodes out of a Parse Tree".

I have my CoreMap sentence and the corresponding tree:

Tree sentenceTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
for (Tree t : sentenceTree) {
    String pos = t.label().value();
    String wd = t.firstChild().label().value();
    Integer wdIndex = ??  // how do I get the index of this word in the sentence?
    CoreLabel token = sentence.get(CoreAnnotations.TokensAnnotation.class).get(wdIndex);
}

I was not able to extract the lemma. Does anyone have an idea how to do it?

I tried the following code, and it works, but it generates some warnings and is not very clean either:

Annotation a = new Annotation("geese");
ss.pipeline.annotate(a);
CoreMap se = a.get(CoreAnnotations.SentencesAnnotation.class).get(0);
CoreLabel token = se.get(CoreAnnotations.TokensAnnotation.class).get(0);
String lemma = token.get(CoreAnnotations.LemmaAnnotation.class);
System.out.println(lemma); // goose

Does anyone have any advice?

Thank you!

  • Does the word index in the sentence tree have the same value as the word index in the CoreMap (sentence)? –  Oct 26 '16 at 08:39

1 Answer

I had the same problem, and I solved it with a map from each leaf to its index in the sentence. This code prints the lemma of every matched leaf that is a noun.

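        // Imports needed: java.util.IdentityHashMap, java.util.List, java.util.Map,
        // edu.stanford.nlp.ling.CoreLabel,
        // edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation and LemmaAnnotation,
        // edu.stanford.nlp.trees.Tree, edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation,
        // edu.stanford.nlp.trees.tregex.TregexPattern and TregexMatcher.

        List<CoreLabel> tokens = sentence.get(TokensAnnotation.class);
        Tree tree = sentence.get(TreeAnnotation.class);
        // Match any noun POS tag; these nodes sit directly above a leaf.
        TregexPattern pattern = TregexPattern.compile("NNP | NNS | NN | NNPS");
        TregexMatcher matcher = pattern.matcher(tree);

        // Map each leaf to its position in the sentence (leaves come in token order).
        // IdentityHashMap keys on the node object itself, so a word that occurs
        // twice cannot overwrite its earlier entry if Tree compares nodes by value.
        Map<Tree, Integer> leafDict = new IdentityHashMap<>();
        int i = 0;
        for (Tree leaf : tree.getLeaves()) {
            leafDict.put(leaf, i);
            i++;
        }

        while (matcher.find()) {
            // The first child of a matched POS node is the word leaf.
            int index = leafDict.get(matcher.getMatch().firstChild());
            String result = tokens.get(index).get(LemmaAnnotation.class);
            System.out.println(result);
        }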

This solution works only when the matched node is one level above a leaf, i.e. when it is a POS-tag node whose first child is the word itself.
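
If the matched node can sit higher in the tree (an NP, for example), one possible workaround is to collect the leaves under the matched node and look up the first and last of them in the same leaf-to-index map. The sketch below is only an illustration of that idea and does not use CoreNLP: `Node`, `collectLeaves`, `indexLeaves` and `tokenSpan` are hypothetical names made up for this example, with `Node` standing in for `Tree`. With CoreNLP, `getLeaves()` on the matched node should play the role of `collectLeaves`. The map is keyed by object identity so that the two occurrences of "the" stay distinct.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class LeafIndexSketch {

    /** Hypothetical minimal tree node; a stand-in for CoreNLP's Tree in this sketch. */
    public static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        public Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Collects the leaves under a node from left to right, i.e. in token order. */
    public static void collectLeaves(Node node, List<Node> out) {
        if (node.isLeaf()) {
            out.add(node);
            return;
        }
        for (Node child : node.children) {
            collectLeaves(child, out);
        }
    }

    /** Maps each leaf object (by identity, not by value) to its 0-based token index. */
    public static Map<Node, Integer> indexLeaves(Node root) {
        List<Node> leaves = new ArrayList<>();
        collectLeaves(root, leaves);
        Map<Node, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < leaves.size(); i++) {
            index.put(leaves.get(i), i);
        }
        return index;
    }

    /** Returns {first, last} token indices covered by a node at any depth. */
    public static int[] tokenSpan(Node node, Map<Node, Integer> leafIndex) {
        List<Node> leaves = new ArrayList<>();
        collectLeaves(node, leaves);
        int first = leafIndex.get(leaves.get(0));
        int last = leafIndex.get(leaves.get(leaves.size() - 1));
        return new int[] {first, last};
    }

    public static void main(String[] args) {
        // (S (NP (DT the) (NNS geese)) (VP (VBD saw) (NP (DT the) (NN dog))))
        Node objectNp = new Node("NP",
                new Node("DT", new Node("the")),
                new Node("NN", new Node("dog")));
        Node root = new Node("S",
                new Node("NP",
                        new Node("DT", new Node("the")),
                        new Node("NNS", new Node("geese"))),
                new Node("VP", new Node("VBD", new Node("saw")), objectNp));

        Map<Node, Integer> leafIndex = indexLeaves(root);
        int[] span = tokenSpan(objectNp, leafIndex);
        System.out.println(span[0] + ".." + span[1]); // prints 3..4
    }
}
```

The returned indices can then be used on the token list in the same way as `index` in the code above, for example to read the lemma of every token in the span.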