I am using the Stanford dependency parser, and I get the following output for the sentence

I shot an elephant in my sleep

python dep_parsing.py 
[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')), ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')), ((u'elephant', u'NN'), u'det', (u'an', u'DT')), ((u'shot', u'VBD'), u'nmod', (u'sleep', u'NN')), ((u'sleep', u'NN'), u'case', (u'in', u'IN')), ((u'sleep', u'NN'), u'nmod:poss', (u'my', u'PRP$'))]

I want to convert this into a graph with nodes being each token and edges being the relation between them.

I need the graph structure for further processing, so it should be easy to modify and easy to represent.

Here is my code so far.

from nltk.parse.stanford import StanfordDependencyParser
stanford_parser_dir = 'stanford-parser/'
eng_model_path = stanford_parser_dir  + "stanford-parser-models/edu/stanford/nlp/models/lexparser/englishRNN.ser.gz"
my_path_to_models_jar = stanford_parser_dir  + "stanford-parser-3.5.2-models.jar"
my_path_to_jar = stanford_parser_dir  + "stanford-parser.jar"

dependency_parser = StanfordDependencyParser(path_to_jar=my_path_to_jar, path_to_models_jar=my_path_to_models_jar)

result = dependency_parser.raw_parse('I shot an elephant in my sleep')
dep = next(result)  # raw_parse returns an iterator of DependencyGraph objects
a = list(dep.triples())
print(a)

How can I make such a graph structure?

Ozgur Vatansever
Riken Shah

1 Answer

You can iterate over dep.triples() to get the desired output.

Code:

for triple in dep.triples():
    print triple[1],"(",triple[0][0],", ",triple[2][0],")"

Output:

nsubj ( shot ,  I )
dobj ( shot ,  elephant )
det ( elephant ,  an )
nmod ( shot ,  sleep )
case ( sleep ,  in )
nmod:poss ( sleep ,  my )
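
If you want a structure you can modify rather than just print, one option is to copy the triples into a plain dictionary. The following is a minimal sketch, not part of NLTK: `build_graph` is a hypothetical helper, the triples are pasted from the output in the question, and the extra `amod` edge is made up for illustration. It assumes every word occurs only once in the sentence, because the triples identify tokens by word and repeated words would be merged into one node.

```python
# Triples copied from the parser output in the question:
# ((head_word, head_tag), relation, (dependent_word, dependent_tag))
triples = [
    (('shot', 'VBD'), 'nsubj', ('I', 'PRP')),
    (('shot', 'VBD'), 'dobj', ('elephant', 'NN')),
    (('elephant', 'NN'), 'det', ('an', 'DT')),
    (('shot', 'VBD'), 'nmod', ('sleep', 'NN')),
    (('sleep', 'NN'), 'case', ('in', 'IN')),
    (('sleep', 'NN'), 'nmod:poss', ('my', 'PRP$')),
]


def build_graph(triples):
    """Hypothetical helper: map each word to a list of (relation, dependent)."""
    graph = {}
    for (head, _), relation, (dependent, _) in triples:
        graph.setdefault(head, []).append((relation, dependent))
        graph.setdefault(dependent, [])  # leaf tokens become nodes too
    return graph


graph = build_graph(triples)

# Adding an edge touches only the dictionary, not the parser's triples.
# The 'amod' edge to 'big' is invented purely for illustration.
graph['elephant'].append(('amod', 'big'))
graph.setdefault('big', [])

for head, edges in graph.items():
    for relation, dependent in edges:
        print("%s -[%s]-> %s" % (head, relation, dependent))
```

Because the graph is a copy, the original triples stay unchanged, so you can rebuild it at any time.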

For more information, see the NLTK DependencyGraph methods triples() and to_dot(), as well as dep.tree().draw().

Edit:

The output of dep.to_dot() is:

digraph G{
edge [dir=forward]
node [shape=plaintext]

0 [label="0 (None)"]
0 -> 2 [label="root"]
1 [label="1 (I)"]
2 [label="2 (shot)"]
2 -> 4 [label="dobj"]
2 -> 7 [label="nmod"]
2 -> 1 [label="nsubj"]
3 [label="3 (an)"]
4 [label="4 (elephant)"]
4 -> 3 [label="det"]
5 [label="5 (in)"]
6 [label="6 (my)"]
7 [label="7 (sleep)"]
7 -> 5 [label="case"]
7 -> 6 [label="nmod:poss"]
}
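
If you change the graph yourself, you can also write your own DOT text and hand it to Graphviz for drawing. This is only a sketch under some assumptions: `edges_to_dot` is a hypothetical helper (not an NLTK function), nodes are named by word instead of by token index as in the output above, words are assumed to contain no double quotes, and the `amod` edge is made up for illustration.

```python
def edges_to_dot(edges):
    """Hypothetical helper: render (head, relation, dependent) edges as DOT."""
    lines = ['digraph G {', '  edge [dir=forward]', '  node [shape=plaintext]']
    for head, relation, dependent in edges:
        lines.append('  "%s" -> "%s" [label="%s"]' % (head, dependent, relation))
    lines.append('}')
    return '\n'.join(lines)


# Edges taken from the triples in the question, reduced to plain words.
edges = [
    ('shot', 'nsubj', 'I'),
    ('shot', 'dobj', 'elephant'),
    ('elephant', 'det', 'an'),
    ('shot', 'nmod', 'sleep'),
    ('sleep', 'case', 'in'),
    ('sleep', 'nmod:poss', 'my'),
]
edges.append(('elephant', 'amod', 'big'))  # invented edge, for illustration

dot_text = edges_to_dot(edges)
print(dot_text)
```

If you save the printed text to a file such as deps.dot and Graphviz is installed, `dot -Tpng deps.dot -o deps.png` should produce an image with the relation names on the edges.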
Riken Shah
RAVI
  • `dep.tree().draw()` does create a tree. However, I want a directed graph whose edges are labelled with the relation between two terms, which is missing in this case. How can I create and display this? – Riken Shah Sep 06 '16 at 09:16
  • Then I think you want print dep.to_dot(). For layout, see https://en.wikipedia.org/wiki/DOT_(graph_description_language)#Layout_programs – RAVI Sep 06 '16 at 10:20
  • Yes, it worked. But how can I add something to that graph without modifying the existing triples (since it returns unicode)? Also, how can I view it graphically (an image popping up, not just text in the terminal)? Is there a library for that? – Riken Shah Sep 06 '16 at 11:58
  • However, I will accept your answer since it partially solved my query. – Riken Shah Sep 06 '16 at 12:10
  • DOT is a standard format. For layout, check Graphviz and the other programs on the Wikipedia page I mentioned in my previous comment. – RAVI Sep 06 '16 at 13:45