I want to extract the actions performed on objects from a list of sentences in R. Here is a small example:
S = "The boy opened the box. He took the chocolates. He ate the chocolates. He went to school"
I am looking for combinations as follows:
Opened box
Took chocolates
Ate chocolates
Went school
I have been able to extract the verbs and the nouns individually, but I can't figure out how to combine them into verb-object pairs like the ones above.
library(openNLP)
library(openNLPmodels.en)
library(NLP)
s = as.String("The boy opened the box. He took the chocolates. He ate the chocolates. He went to school")
tagPOS <- function(x, ...) {
  s <- as.String(x)
  word_token_annotator <- Maxent_Word_Token_Annotator()
  # Treat the whole text as a single sentence span, then tokenize it
  a2 <- Annotation(1L, "sentence", 1L, nchar(s))
  a2 <- annotate(s, word_token_annotator, a2)
  # Add a POS tag to every word token
  a3 <- annotate(s, Maxent_POS_Tag_Annotator(), a2)
  a3w <- a3[a3$type == "word"]
  POStags <- unlist(lapply(a3w$features, `[[`, "POS"))
  # Build a comma-separated string of word/TAG pairs
  POStagged <- paste(sprintf("%s/%s", s[a3w], POStags), collapse = ",")
  list(POStagged = POStagged, POStags = POStags)
}
nouns = c("/NN", "/NNS","/NNP","/NNPS")
verbs = c("/VB","/VBD","/VBG","/VBN","/VBP","/VBZ")
s = tolower(s)
s = gsub("\n", " ", s)  # replace line breaks with a space so adjacent words are not fused
s = gsub('"',"",s)
tags = tagPOS(s)
tags = tags$POStagged
tags = unlist(strsplit(tags, split=","))
nouns_present = tags[grepl(paste(nouns, collapse = "|"), tags)]
nouns_present = unique(nouns_present)
verbs_present = tags[grepl(paste(verbs, collapse = "|"), tags)]
verbs_present = unique(verbs_present)
nouns_present<- gsub("^(.*?)/.*", "\\1", nouns_present)
verbs_present = gsub("^(.*?)/.*", "\\1", verbs_present)
nouns_present = paste("'", as.character(nouns_present), "'", collapse = ",", sep = "")
verbs_present = paste("'", as.character(verbs_present), "'", collapse = ",", sep = "")
The idea is to build a network graph in which clicking a verb node brings up all the objects attached to it, and vice versa. Any help would be appreciated.
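To make the kind of pairing I am after concrete, here is a rough base-R sketch. It assumes the tagger output is available as a vector of "word/TAG" tokens, and it uses a simple heuristic rather than a real dependency parse: each verb is paired with the first noun that follows it before the sentence ends or another verb appears. The function name `pair_verbs_with_objects` and the hard-coded `tagged` vector are hypothetical; the tags shown are what I would expect for the example sentences, not verified tagger output.

```r
# Hypothetical helper (not part of openNLP): pair each verb with the first
# noun that follows it within the same sentence.
# `tagged` is a character vector of "word/TAG" tokens (Penn Treebank tags).
pair_verbs_with_objects <- function(tagged) {
  words <- sub("/[^/]*$", "", tagged)   # text before the last slash
  pos   <- sub("^.*/", "", tagged)      # tag after the last slash
  is_verb <- grepl("^VB", pos)
  is_noun <- grepl("^NN", pos)
  is_end  <- pos == "."                 # sentence-final punctuation

  pairs <- list()
  for (i in which(is_verb)) {
    j <- i + 1
    # Scan forward until the sentence ends or another verb starts
    while (j <= length(tagged) && !is_end[j] && !is_verb[j]) {
      if (is_noun[j]) {
        pairs[[length(pairs) + 1]] <- data.frame(
          verb = words[i], object = words[j], stringsAsFactors = FALSE
        )
        break
      }
      j <- j + 1
    }
  }
  if (length(pairs) == 0) {
    return(data.frame(verb = character(0), object = character(0),
                      stringsAsFactors = FALSE))
  }
  unique(do.call(rbind, pairs))
}

# Assumed tagger output for the example text (illustrative only)
tagged <- c("the/DT", "boy/NN", "opened/VBD", "the/DT", "box/NN", "./.",
            "he/PRP", "took/VBD", "the/DT", "chocolates/NNS", "./.",
            "he/PRP", "ate/VBD", "the/DT", "chocolates/NNS", "./.",
            "he/PRP", "went/VBD", "to/TO", "school/NN")

pairs <- pair_verbs_with_objects(tagged)
print(pairs)
```

The result is a two-column edge list (verb, object), which is the shape a graph package would typically take as input, for example `igraph::graph_from_data_frame`. The heuristic will miss or mis-pair objects in passive or more complex sentences, so a dependency parser would presumably be more reliable.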