
In the Stanford CoreNLP online demo, the example sentence "A minimal software item that can be tested in isolation" produces the following collapsed dependencies with CC processed:

root ( ROOT-0 , item-4 )
det ( item-4 , A-1 )
amod ( item-4 , minimal-2 )
nn ( item-4 , software-3 )
nsubjpass ( tested-8 , that-5 )
aux ( tested-8 , can-6 )
auxpass ( tested-8 , be-7 )
rcmod ( item-4 , tested-8 )
prep_in ( tested-8 , isolation-10 )

My Java class produces the same output, except that the root(...) line is missing. This is the code I am running:

public static void main(String[] args)
    {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation document = new Annotation(args[0]);

        pipeline.annotate(document);

        List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);

        for (CoreMap sentence : sentences) {
            SemanticGraph dependencies = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
            System.out.println(dependencies.toList());
        }
    }

So why doesn't my Java code output the root? Am I missing something?

werd

1 Answer


This is a good question, in the sense that it exposes a weakness in the current code. At present, the root node and the edge from it are not stored in the graph.* Instead, the roots are kept in a separate list and have to be accessed separately from the edges. Here are two things that will work: (1) Add this code above the System.out.println:

IndexedWord root = dependencies.getFirstRoot();
System.out.printf("root(ROOT-0, %s-%d)%n", root.word(), root.index());

(2) Replace your current System.out.println line with:

System.out.println(dependencies.toString("readable"));

Unlike toList() and the other toString() variants, this one does print the root(s).
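
To make the reason concrete, here is a minimal sketch of a graph that keeps its roots apart from its edges. Everything in it (`RootEdgeSketch`, `ToyGraph`, `Word`, `Edge`, and their methods) is a hypothetical stand-in written for this illustration, not a CoreNLP class, and it only assumes the behavior described above: a listing of the edges cannot contain a root(...) line, so that line has to be built from the separate root list.

```java
import java.util.ArrayList;
import java.util.List;

public class RootEdgeSketch {

    /** A token plus its 1-based position (hypothetical stand-in for a graph node). */
    public static final class Word {
        final String text;
        final int index;

        Word(String text, int index) {
            this.text = text;
            this.index = index;
        }

        @Override
        public String toString() {
            return text + "-" + index;
        }
    }

    /** A labeled dependency edge from a governor to a dependent. */
    public static final class Edge {
        final String relation;
        final Word governor;
        final Word dependent;

        Edge(String relation, Word governor, Word dependent) {
            this.relation = relation;
            this.governor = governor;
            this.dependent = dependent;
        }

        @Override
        public String toString() {
            return relation + "(" + governor + ", " + dependent + ")";
        }
    }

    /** Toy graph: edges live in one list, roots in another (an assumption of this sketch). */
    public static final class ToyGraph {
        final List<Edge> edges = new ArrayList<>();
        final List<Word> roots = new ArrayList<>();

        /** Lists only the stored edges, so no root(...) line can appear. */
        public List<String> edgeLines() {
            List<String> out = new ArrayList<>();
            for (Edge e : edges) {
                out.add(e.toString());
            }
            return out;
        }

        /** Prepends one synthetic root(ROOT-0, word-index) line per stored root. */
        public List<String> linesWithRoots() {
            List<String> out = new ArrayList<>();
            for (Word r : roots) {
                out.add("root(ROOT-0, " + r + ")");
            }
            out.addAll(edgeLines());
            return out;
        }
    }

    /** Builds the first few dependencies of the example sentence from the question. */
    public static ToyGraph sampleGraph() {
        Word a = new Word("A", 1);
        Word minimal = new Word("minimal", 2);
        Word software = new Word("software", 3);
        Word item = new Word("item", 4);

        ToyGraph g = new ToyGraph();
        g.roots.add(item); // the root is recorded here, not as an edge
        g.edges.add(new Edge("det", item, a));
        g.edges.add(new Edge("amod", item, minimal));
        g.edges.add(new Edge("nn", item, software));
        return g;
    }

    public static void main(String[] args) {
        ToyGraph g = sampleGraph();
        System.out.println(g.edgeLines());      // edges only
        System.out.println(g.linesWithRoots()); // root line(s) first, then edges
    }
}
```

Option (1) above corresponds to `linesWithRoots()` in this sketch: the root line is produced from the root list and printed before the edges.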

*There are historical reasons for this: we used to have no explicit root at all. But at this point the behavior is awkward and dysfunctional and should be changed. That will probably happen in a future release.

Christopher Manning
  • I managed to find another solution for my case: `GrammaticalStructure gs = gsf.newGrammaticalStructure(tree);` `Collection tdl = gs.typedDependenciesCCprocessed();` – werd May 01 '13 at 22:06
  • Yes, that works well, since the ROOT really is in that collection of dependencies. The minor cost is that you are paying for them to be generated a second time from the parse tree. – Christopher Manning May 02 '13 at 22:43