I am trying to build a dependency tree from CoNLL input using NLTK's DependencyGraph. As I understand it, this class provides a tree() method that builds a tree of the dependencies, but without the relation between a head and its dependents, and without the POS tags. There is also a triples() method that yields the head, the relation, and the dependent, each with its POS tag. With triples(), however, it is hard for me to get the dependents when a word is repeated in the sentence, as in "the red car is behind the blue car", because the index of the word is not part of the triples. Here we have two different nodes for the same word, car.

So how can I get, from CoNLL input, a dependency tree with the head word, its tags, its relation, and its children? It could also be a similar data structure in which this information (head word, its tags, relation, children) can be found for a given sentence. Any suggestion is welcome. Below is some code that can be used as a starting point.

from nltk.parse import DependencyGraph


conll_data2 = """1   Cathy             Cathy             N     N     eigen|ev|neut                    2   su      _  _
2   zag               zie               V     V     trans|ovt|1of2of3|ev             0   ROOT    _  _
3   hen               hen               Pron  Pron  per|3|mv|datofacc                2   obj1    _  _
4   wild              wild              Adj   Adj   attr|stell|onverv                5   mod     _  _
5   zwaaien           zwaai             N     N     soort|mv|neut                    2   vc      _  _
6   .                 .                 Punc  Punc  punt                             5   punct   _  _

1   the _   DET DT  _   3   det _   _
2   blue    _   ADJ JJ  _   3   amod    _   _
3   car _   NOUN    NN  _   4   nsubj   _   _
4   is  _   VERB    VBZ _   0   ROOT    _   _
5   behind  _   ADP IN  _   4   prep    _   _
6   the _   DET DT  _   8   det _   _
7   red _   ADJ JJ  _   8   amod    _   _
8   car _   NOUN    NN  _   5   pobj    _   _

1   Ze                ze                Pron  Pron  per|3|evofmv|nom                 2   su      _  _
2   had               heb               V     V     trans|ovt|1of2of3|ev             0   ROOT    _  _
3   met               met               Prep  Prep  voor                             8   mod     _  _
4   haar              haar              Pron  Pron  bez|3|ev|neut|attr               5   det     _  _
5   moeder            moeder            N     N     soort|ev|neut                    3   obj1    _  _
6   kunnen            kan               V     V     hulp|ott|1of2of3|mv              2   vc      _  _
7   gaan              ga                V     V     hulp|inf                         6   vc      _  _
8   winkelen          winkel            V     V     intrans|inf                      11  cnj     _  _
9   ,                 ,                 Punc  Punc  komma                            8   punct   _  _
10  zwemmen           zwem              V     V     intrans|inf                      11  cnj     _  _
11  of                of                Conj  Conj  neven                            7   vc      _  _
12  terrassen         terras            N     N     soort|mv|neut                    11  cnj     _  _
13  .                 .                 Punc  Punc  punt                             12  punct   _  _
"""


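# Build one graph per sentence; sentences are separated by a blank line.
graphs = [DependencyGraph(entry)
          for entry in conll_data2.split('\n\n') if entry]

for graph in graphs:
    # TODO: find a data structure here that gives the head word, its tag,
    # its relation, and its children.
    pass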
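
One possible direction, as a sketch only: as far as I know, DependencyGraph keeps a `nodes` mapping from each token's index to a dict with keys such as `word`, `tag`, `head`, `rel` and `deps` (relation to list of child indices). Working from indices instead of words keeps the two `car` nodes apart. The helper name `node_records` is hypothetical, and the demo uses a hand-built mapping in that assumed shape so that it runs without NLTK.

```python
def node_records(nodes):
    """Yield one record per real token, keyed by its index in the sentence.

    `nodes` is assumed to have the shape of DependencyGraph.nodes:
    {index: {"word", "tag", "head", "rel", "deps": {relation: [indices]}}}.
    """
    for address in sorted(nodes):
        node = nodes[address]
        if node.get("word") is None:
            continue  # skip the artificial root at index 0
        head = nodes.get(node["head"], {})
        children = [
            (child, nodes[child]["word"], relation)
            for relation, addresses in node["deps"].items()
            for child in addresses
        ]
        yield {
            "index": address,
            "word": node["word"],
            "tag": node["tag"],
            "rel": node["rel"],
            "head": (node["head"], head.get("word")),
            "children": sorted(children),
        }


# Hand-built stand-in for graph.nodes of the second sentence in the question.
rows = [
    (1, "the", "DT", 3, "det"),
    (2, "blue", "JJ", 3, "amod"),
    (3, "car", "NN", 4, "nsubj"),
    (4, "is", "VBZ", 0, "ROOT"),
    (5, "behind", "IN", 4, "prep"),
    (6, "the", "DT", 8, "det"),
    (7, "red", "JJ", 8, "amod"),
    (8, "car", "NN", 5, "pobj"),
]
nodes = {0: {"word": None, "tag": "TOP", "head": None, "rel": None, "deps": {}}}
for address, word, tag, head, rel in rows:
    nodes[address] = {"word": word, "tag": tag, "head": head, "rel": rel, "deps": {}}
for address, word, tag, head, rel in rows:
    nodes[head]["deps"].setdefault(rel, []).append(address)

records = {record["index"]: record for record in node_records(nodes)}
print(records[3])  # the first "car" and its own children
print(records[8])  # the second "car" and its own children
```

With the graphs from the question, the same helper would be called as `node_records(graph.nodes)` inside the loop, assuming the node layout described above.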
David