I'm trying to build ternary intra-sentence relationships. One of the methods I'm considering is the shortest dependency path approach, where the POS tag sequence and the dependency label sequence along the shortest path are used as features for a kernel-based SVM. I'm not sure how to formulate these features.

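import spacy
import networkx as nx

# Assumes an English spaCy pipeline is installed; 'en_core_web_sm' is one option.
nlp = spacy.load('en_core_web_sm')

txt = 'Domestic revenues increased 14% to $680.8 million and were 77% of total revenues for the year ended December 31, 2015.'

doc = nlp(txt)
for token in doc:
    print((token.head.text, token.text, token.dep_, token.pos_))

# Build an undirected graph from head -> child dependency edges.
# Nodes are lowercased token texts, so repeated words (e.g. both "revenues") share one node.
edges = []
for token in doc:
    for child in token.children:
        edges.append((token.lower_, child.lower_))

graph = nx.Graph(edges)

print('shortest path length :{}'.format(nx.shortest_path_length(graph, source='revenues', target='2015')))
print('shortest path: {}'.format(nx.shortest_path(graph, source='revenues', target='2015')))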

The shortest path between the second token of "Domestic revenues" and "2015" looks like this:

shortest path length :7
shortest path: ['revenues', 'increased', 'were', 'for', 'year', 'ended', 'december', '2015']
  1. How do I use this dependency path as a feature sequence for a ternary relationship? (e.g. Audit-nsubj-increased-quantmod-million--conj-were)

  2. How do I use generalized POS tags for these entity relationships? (e.g. Audit-verb-num-num)

Since the entities in question are compound, I'm OK with the model classifying the last tokens of the entities as a ternary relationship, like this: (revenues, million, 2015) --> (Audit, value, data)
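
To make the two questions concrete, here is a minimal sketch of what such feature sequences could look like. It is only an illustration under stated assumptions: the parse in `TOKENS` is a hand-written toy example (not real parser output), the helper names `build_adjacency`, `shortest_path`, `path_features` and `ternary_features` are hypothetical, and describing a ternary relation by its three pairwise paths is one possible formulation, not an established recipe. Nodes are keyed by token index rather than by text, so repeated words stay distinct.

```python
from collections import deque

# Hand-written toy parse (an assumption for illustration, NOT real parser output).
# Each entry: (text, head index, dependency label, coarse POS tag).
TOKENS = [
    ("revenues",  1, "nsubj", "NOUN"),  # 0
    ("increased", 1, "ROOT",  "VERB"),  # 1 (root: head is itself)
    ("to",        1, "prep",  "ADP"),   # 2
    ("million",   2, "pobj",  "NUM"),   # 3
    ("in",        1, "prep",  "ADP"),   # 4
    ("2015",      4, "pobj",  "NUM"),   # 5
]


def build_adjacency(tokens):
    """Undirected adjacency list over token indices."""
    adj = {i: [] for i in range(len(tokens))}
    for idx, (_, head, _, _) in enumerate(tokens):
        if head != idx:  # skip the root's self-loop
            adj[idx].append(head)
            adj[head].append(idx)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns a list of token indices, or [] if unreachable."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if goal not in prev:
        return []
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def path_features(tokens, path):
    """Word, dependency-label and POS sequences along a path of token indices."""
    deps = []
    for a, b in zip(path, path[1:]):
        if tokens[a][1] == b:
            deps.append("up:" + tokens[a][2])    # moving from child to head
        else:
            deps.append("down:" + tokens[b][2])  # moving from head to child
    return {
        "words": [tokens[i][0] for i in path],
        "deps": deps,
        "pos": [tokens[i][3] for i in path],
    }


def ternary_features(tokens, e1, e2, e3):
    """Describe a candidate triple by the paths between each pair of entity tokens."""
    adj = build_adjacency(tokens)
    pairs = {"e1_e2": (e1, e2), "e2_e3": (e2, e3), "e1_e3": (e1, e3)}
    return {name: path_features(tokens, shortest_path(adj, a, b))
            for name, (a, b) in pairs.items()}


if __name__ == "__main__":
    # Entity head tokens: revenues (0), million (3), 2015 (5).
    for name, feats in ternary_features(TOKENS, 0, 3, 5).items():
        print(name, feats)
```

With spaCy, the same tuples could presumably be built from `token.i`, `token.head.i`, `token.dep_` and `token.pos_`. The resulting label sequences would then still need a sequence kernel, or a bag-of-n-grams encoding, before they can be passed to an SVM.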

Sam H.
sr33kant