
I'm using the Stanford Tagger to determine parts of speech. However, I want to get more information out of the text. Is it possible to get further information, such as the tense of a sentence or whether it is in active or passive voice?

So far, I'm using a very basic PoS tagging approach:

// Assumes `tagger` is an already-initialized MaxentTagger.
List<List<TaggedWord>> taggedUnits = new ArrayList<>();

String input = "This sentence is going to be future. The door was opened.";
// tokenizeText splits the input into sentences, each a list of tokens.
for (List<HasWord> sentence : MaxentTagger.tokenizeText(new StringReader(input)))
{
    taggedUnits.add(tagger.tagSentence(sentence));
}
David Müller

1 Answer


You can get tense information from the Penn Treebank verb tags:

27. VB  Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
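
As a rough sketch of how the tags above could be turned into tense hints: the helper below maps each verb tag to a coarse label. The class `TenseHints` and its methods are hypothetical names made up for illustration, not part of the Stanford tagger. The `MD` (modal) case is also an assumption added here, because English usually marks the future with a modal such as "will" and not with a dedicated verb tag. The labels describe individual verb forms, so deciding the tense of a whole sentence still needs some logic on top.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Stanford tagger API.
class TenseHints {

    // Maps one Penn Treebank tag to a coarse label. "MD" (modal) is an
    // assumption added here; the list above only covers the VB* tags.
    static String describe(String tag) {
        switch (tag) {
            case "VB":  return "base form";
            case "VBD": return "past";
            case "VBG": return "gerund or present participle";
            case "VBN": return "past participle";
            case "VBP":
            case "VBZ": return "present";
            case "MD":  return "modal";
            default:    return "other";
        }
    }

    // Keeps only the labels of verb and modal tags, in sentence order.
    static List<String> verbLabels(List<String> tags) {
        List<String> labels = new ArrayList<>();
        for (String tag : tags) {
            String label = describe(tag);
            if (!label.equals("other")) {
                labels.add(label);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hand-written tags for "The door was opened ." (not real tagger output).
        List<String> tags = Arrays.asList("DT", "NN", "VBD", "VBN", ".");
        System.out.println(verbLabels(tags)); // prints [past, past participle]
    }
}
```

With the tagger from the question, the tag strings would come from each `TaggedWord` via its `tag()` method.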

For the active/passive distinction, you can use the typed dependencies included in Stanford CoreNLP.

  1. If the sentence is in active voice, an 'nsubj' dependency should exist.
  2. If the sentence is in passive voice, an 'nsubjpass' dependency should exist.
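
The two rules above can be sketched as a small check over the relation names that a dependency parse produces. `VoiceHints` and `voices` are hypothetical names, and the input is a plain list of relation short names, so the sketch stays independent of any particular CoreNLP version. Treating `nsubj:pass` as a passive marker is an assumption about newer label sets. A sentence with several clauses can contain both relations, so the method returns a set instead of a single answer.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Stanford CoreNLP.
class VoiceHints {

    enum Voice { ACTIVE, PASSIVE }

    // Returns every voice signalled by the given dependency relation names.
    static Set<Voice> voices(List<String> relationNames) {
        Set<Voice> found = EnumSet.noneOf(Voice.class);
        for (String rel : relationNames) {
            if (rel.equals("nsubj")) {
                found.add(Voice.ACTIVE);
            } else if (rel.equals("nsubjpass") || rel.equals("nsubj:pass")) {
                // "nsubj:pass" is assumed to be the newer spelling of nsubjpass.
                found.add(Voice.PASSIVE);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hand-written relations for "The door was opened." (not real parser output).
        System.out.println(voices(Arrays.asList("det", "nsubjpass", "auxpass")));
        // A multi-clause sentence can carry both subject relations.
        System.out.println(voices(Arrays.asList("nsubj", "advmod", "nsubjpass")));
    }
}
```

When both voices come back, looking at which verb each subject relation attaches to gives a per-clause answer, which is usually more useful than picking one for the whole sentence.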

Hope this helps.

bogs
  • Thank you very much for your help! However, I got stuck when using German for "active/passive detection" -> http://stackoverflow.com/questions/19531208/how-to-use-stanford-corenlp-with-a-non-english-parse-model – David Müller Oct 23 '13 at 01:34
  • I've been reading the docs on this, and the nsubjpass relation seems to be a feature of all passive sentences - http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/EnglishGrammaticalRelations.html#NOMINAL_PASSIVE_SUBJECT – JasTonAChair Feb 16 '16 at 12:45
  • This is very useful, but isn't the full story because both can turn up. For example "They spoke no more until camp was made." I get nsubjpass for 'camp' and nsubj for 'They'. Would it be reasonable to assume the earlier one in the sentence is more important? – havlock Mar 28 '19 at 08:23