
I am using dependency parsing in R with the coreNLP package, and I need to reshape the resulting data frame for a specific use case.

I need a data frame with three columns. So far, I have used the code below to get as far as the dependency tree.

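# One-time setup: install the package and download the CoreNLP files
devtools::install_github("statsmaths/coreNLP")
coreNLP::downloadCoreNLP()

library(coreNLP)   # needed so the unqualified calls below resolve
initCoreNLP()
inp_cl <- "generate odd numbers from column one and print."
output <- annotateString(inp_cl)
dc <- getDependency(output)   # one row per dependency edge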

 sentence governor dependent      type governorIdx dependentIdx govIndex depIndex
1        1     ROOT  generate      root           0            1       NA        1
2        1  numbers       odd      amod           3            2        3        2
3        1 generate   numbers      dobj           1            3        1        3
4        1   column      from      case           5            4        5        4
5        1 generate    column nmod:from           1            5        1        5
6        1   column       one    nummod           5            6        5        6
7        1   column       and        cc           5            7        5        7
8        1 generate     print nmod:from           1            8        1        8
9        1   column     print  conj:and           5            8        5        8
10       1 generate         .     punct           1            7        1        10

I then added the POS tags of the governor and the dependent with the following code, which gives the data frame shown below it.

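ps <- getToken(output)

# Keep only the columns needed below (they include sentence, id and POS)
ps <- ps[, c(1, 2, 7, 3)]

# Rename depIndex (column 8) to "id"
colnames(dc)[8] <- "id"

# Attach the POS tag of the governor token
dp <- merge(dc, ps[, c("sentence", "id", "POS")],
            by.x = c("sentence", "governorIdx"), by.y = c("sentence", "id"), all.x = TRUE)

# Attach the POS tag of the dependent token
dp <- merge(dp, ps[, c("sentence", "id", "POS")],
            by.x = c("sentence", "dependentIdx"), by.y = c("sentence", "id"), all.x = TRUE)

colnames(dp)[9:10] <- c("POS_gov", "POS_dep")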


  sentence dependentIdx governorIdx governor dependent      type govIndex id POS_gov POS_dep
1         1            1           0     ROOT  generate      root       NA  1    <NA>      VB
2         1            2           3  numbers       odd      amod        3  2     NNS      JJ
3         1            3           1 generate   numbers      dobj        1  3      VB     NNS
4         1            4           5   column      from      case        5  4      NN      IN
5         1            5           1 generate    column nmod:from        1  5      VB      NN
6         1            6           5   column       one    nummod        5  6      NN      CD
7         1            7           5   column       and        cc        5  7      NN      CC
8         1            8           1 generate     print nmod:from        1  8      VB      NN
9         1            8           5   column     print  conj:and        5  8      NN      NN
10        1            9           1 generate         .     punct        1  9      VB       .

If a verb (action word) is attached to a non-verb (non-action word), and that non-verb is in turn connected to other non-verbs, then one row should capture the entire connection. For example, generate is a verb connected to numbers, and numbers is a non-verb connected to odd.

So the intended data frame is:

Topic1 Topic2 Action
numbers odd    generate
column  from   generate
column  one    generate
column  and    generate
column  from   print
column  one    print
column  and    print
         .     generate
NinjaR

1 Answer


First, you'll need your dependency parser to tag "print" as a verb rather than a noun.

Try using a sentence with two independent clauses, and see if the root of the second independent clause is tagged as such.

If so, it's a simple walk through the governorIdx column. If not, you'll need to address the mechanics of your dependency tree generator.
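
As a rough illustration of what a walk through the governorIdx column could look like, here is a minimal sketch in base R. It is not a complete solution. The data frame dp is rebuilt by hand from the merged table in the question, and the helpers is_verb and nearest_verb are hypothetical names, not part of coreNLP. The sketch assumes that any POS tag starting with "VB" counts as a verb and that, when a token has several governors, following the first one listed is acceptable.

```r
# Dependency rows copied by hand from the merged data frame in the question.
dp <- data.frame(
  dependentIdx = c(1, 2, 3, 4, 5, 6, 7, 8, 8, 9),
  governorIdx  = c(0, 3, 1, 5, 1, 5, 5, 1, 5, 1),
  governor     = c("ROOT", "numbers", "generate", "column", "generate",
                   "column", "column", "generate", "column", "generate"),
  dependent    = c("generate", "odd", "numbers", "from", "column",
                   "one", "and", "print", "print", "."),
  POS_gov      = c(NA, "NNS", "VB", "NN", "VB", "NN", "NN", "VB", "NN", "VB"),
  POS_dep      = c("VB", "JJ", "NNS", "IN", "NN", "CD", "CC", "NN", "NN", "."),
  stringsAsFactors = FALSE
)

# Assumption: every tag beginning with "VB" is treated as a verb.
is_verb <- function(pos) !is.na(pos) & startsWith(pos, "VB")

# Hypothetical helper: climb from token `idx` towards the root and return
# the first verb met on the way, or NA if there is none.
nearest_verb <- function(idx, dp) {
  for (step in seq_len(nrow(dp))) {        # bounded, in case of a cycle
    if (idx == 0) break                    # reached ROOT
    row <- match(idx, dp$dependentIdx)     # first row where idx is the dependent
    if (is.na(row)) break
    if (is_verb(dp$POS_dep[row])) return(dp$dependent[row])
    idx <- dp$governorIdx[row]             # move one step up the tree
  }
  NA_character_
}

# Keep the edges whose governor is a non-verb, then attach the governing verb.
chains <- dp[dp$governorIdx != 0 & !is_verb(dp$POS_gov), ]
result <- data.frame(
  Topic1 = chains$governor,
  Topic2 = chains$dependent,
  Action = vapply(chains$governorIdx, nearest_verb, character(1), dp = dp),
  stringsAsFactors = FALSE
)
print(result)
```

On the rows from the question, "print" is tagged NN, so it ends up in Topic2 and never in Action. The rows of the intended output that have print as the action can only appear once the tagger labels it as a verb, and the punctuation row is not handled by this sketch at all.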

John R
  • Thanks for the response. I have edited the question to show the state I have arrived at. Could you please take a look and share a code sample that may help me out? – NinjaR Feb 16 '18 at 06:18
  • My answer is still the answer. Your question is too broad for a specific "code sample". No one here is going to go find a multi-root dependency tree generator for you. Mark the answer correct, try to find one yourself, and then post another question at the next obstacle you hit. – John R Feb 16 '18 at 16:24
  • That was extremely helpful..thanks a ton...I will surely post when I hit my next obstacle. Great job making the decision for the entire SO population. Thanks again. – NinjaR Feb 16 '18 at 19:30