I want to find out, from a series of sentences, whether an action has been carried out or will be carried out. For example: "I will prescribe this medication" versus "I prescribed this medication", or "He had already taken the stuff" versus "he may take the stuff later".

I was trying a tidytext approach and decided to simply look for past participle versus future participle verbs. However, when I POS tag, the only types of verbs I get are "Verb intransitive", "Verb (usu participle)" and "Verb (transitive)". How can I get an idea of past or future verbs, or is there another POS tagger I can use?

I am keen to use tidytext because I cannot install rJava, which some of the other text mining packages use.

Ben Bolker
Sebastian Zeki
  • You could use the package udpipe as a POS tagger. But this will return the word "may" as an auxiliary, not as a future verb – phiver Feb 18 '19 at 18:16
  • Look to the morphological features of the udpipe output. It provides you these details. –  Feb 19 '19 at 18:59

1 Answer

Look at the morphological features in the udpipe annotation. They are stored in the feats column of the annotation, and you can add them as extra columns to the dataset with cbind_morphological. All the features are defined at https://universaldependencies.org/u/feat/index.html. You'll see below that prescribed, from the sentence 'I prescribed this medication', is tagged as past tense, as are the words had and taken from 'He had already taken the stuff'.

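library(udpipe)

# Four example sentences, one document per sentence
x <- data.frame(doc_id = 1:4, 
                text = c("I will prescribe this medication", 
                         "I prescribed this medication", 
                         "He had already taken the stuff", 
                         "he may take the stuff later"), 
                stringsAsFactors = FALSE)

# Annotate the text with the English model
anno <- udpipe(x, "english")
# Split the feats column into separate morph_* columns
anno <- cbind_morphological(anno)

# Keep only the columns relevant to verb form and tense
anno[, c("doc_id", "token", "lemma", "feats", "morph_verbform", "morph_tense")]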

 doc_id      token      lemma                                                  feats morph_verbform morph_tense
      1          I          I             Case=Nom|Number=Sing|Person=1|PronType=Prs           <NA>        <NA>
      1       will       will                                           VerbForm=Fin            Fin        <NA>
      1  prescribe  prescribe                                           VerbForm=Inf            Inf        <NA>
      1       this       this                               Number=Sing|PronType=Dem           <NA>        <NA>
      1 medication medication                                            Number=Sing           <NA>        <NA>
      2          I          I             Case=Nom|Number=Sing|Person=1|PronType=Prs           <NA>        <NA>
      2 prescribed  prescribe                       Mood=Ind|Tense=Past|VerbForm=Fin            Fin        Past
      2       this       this                               Number=Sing|PronType=Dem           <NA>        <NA>
      2 medication medication                                            Number=Sing           <NA>        <NA>
      3         He         he Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs           <NA>        <NA>
      3        had       have                       Mood=Ind|Tense=Past|VerbForm=Fin            Fin        Past
      3    already    already                                                   <NA>           <NA>        <NA>
      3      taken       take                               Tense=Past|VerbForm=Part           Part        Past
      3        the        the                              Definite=Def|PronType=Art           <NA>        <NA>
      3      stuff      stuff                                            Number=Sing           <NA>        <NA>
      4         he         he Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs           <NA>        <NA>
      4        may        may                                           VerbForm=Fin            Fin        <NA>
      4       take       take                                           VerbForm=Inf            Inf        <NA>
      4        the        the                              Definite=Def|PronType=Art           <NA>        <NA>
      4      stuff      stuff                                            Number=Sing           <NA>        <NA>
      4      later      later                                                   <NA>           <NA>        <NA>
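
To go from this token-level output to one label per sentence, something like the rough sketch below might work. The helper classify_action, its label strings and its list of modal lemmas are all made up for illustration and are not part of udpipe. The rule is only a heuristic: a sentence containing a modal such as "will" or "may" is treated as not yet carried out, otherwise any token with Tense=Past marks it as carried out. The data frame is typed in by hand from the verb and auxiliary rows of the output above, so the example runs without downloading a model.

```r
# Verb and auxiliary rows copied by hand from the annotation output above
anno <- data.frame(
  doc_id      = c(1, 1, 2, 3, 3, 4, 4),
  token       = c("will", "prescribe", "prescribed", "had", "taken", "may", "take"),
  lemma       = c("will", "prescribe", "prescribe", "have", "take", "may", "take"),
  morph_tense = c(NA, NA, "Past", "Past", "Past", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a udpipe function): one label per document.
# The modal list is an assumption and is certainly incomplete.
classify_action <- function(anno,
                            modals = c("will", "shall", "may", "might",
                                       "can", "could", "would")) {
  labels <- sapply(split(anno, anno$doc_id), function(d) {
    if (any(d$lemma %in% modals)) {
      "not yet carried out"            # a modal takes priority over tense
    } else if (any(d$morph_tense %in% "Past")) {
      "carried out"                    # %in% treats NA as FALSE
    } else {
      "unknown"
    }
  })
  data.frame(doc_id = names(labels), status = unname(labels),
             stringsAsFactors = FALSE)
}

classify_action(anno)
```

On the four example sentences this labels documents 1 and 4 as not yet carried out and documents 2 and 3 as carried out. It will get other cases wrong: "I would have prescribed it" contains a modal, and "will" can also be a noun. Restricting the modal check to tokens that udpipe tags as auxiliaries in its upos column would probably help with the second problem.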