
I am looking for a way, given an English text, to count the verb phrases in it by tense: past, present and future. For now I am using NLTK to do POS (part-of-speech) tagging and then count, say, 'VBD' tags to get past tenses. This is not accurate enough though, so I guess I need to go further and use chunking, then analyze the VP chunks for specific tense patterns. Is there anything existing that does that? Any further reading that might be helpful? The NLTK book focuses mostly on NP chunks, and I can find very little information on VP chunks.
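
Roughly, what I am doing now is something like this (a minimal sketch of the tag-counting approach described above; the sample sentence is made up):

    import nltk

    # Requires the 'punkt' and 'averaged_perceptron_tagger' NLTK data packages.
    text = "I walked to the store, and I will buy milk tomorrow."
    tagged = nltk.pos_tag(nltk.word_tokenize(text))

    # Count simple past-tense verbs by their POS tag alone.
    past_count = sum(1 for word, tag in tagged if tag == 'VBD')
    print(past_count)  # 1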

Michael Pliskin

2 Answers


The exact answer depends on which chunker you intend to use, but list comprehensions will take you a long way. This gets you the number of verb phrases, using a non-existent chunker:

len([phrase for phrase in nltk.Chunker(sentence) if phrase[1] == 'VP'])

You can take a more fine-grained approach to count the individual tenses.
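
For example, here is a minimal sketch using NLTK's RegexpParser as the chunker; the VP grammar and the tag-based tense rules are only illustrative assumptions, not a complete tense detector:

    import nltk

    # Chunk grammar: an optional modal followed by one or more verb forms.
    chunker = nltk.RegexpParser(r"VP: {<MD>?<VB.*>+}")

    def tense_counts(sentence):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        counts = {'past': 0, 'present': 0, 'future': 0}
        for vp in chunker.parse(tagged).subtrees(filter=lambda t: t.label() == 'VP'):
            words = [w.lower() for w, t in vp.leaves()]
            tags = [t for w, t in vp.leaves()]
            if 'will' in words or 'shall' in words:        # crude future marker
                counts['future'] += 1
            elif any(t in ('VBD', 'VBN') for t in tags):   # simple past or participle
                counts['past'] += 1
            else:                                          # everything else as present
                counts['present'] += 1
        return counts

    print(tense_counts("He walked home, she will drive, and they are waiting."))
    # {'past': 1, 'present': 1, 'future': 1}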

Tim McNamara
  • Thanks for the pointer, that's what I am gonna use - my next question is whether there is something existing to detect tense patterns. For each VP I'd like to know what tense it is in. – Michael Pliskin Aug 09 '10 at 10:55
  • I actually managed to solve my problem with this approach, so I am marking this as the accepted answer. The following article is really helpful: http://streamhacker.com/2009/02/23/chunk-extraction-with-nltk/ – Michael Pliskin Aug 16 '10 at 12:46
  • Hi Michael, great to hear that things are working well for you! – Tim McNamara Aug 17 '10 at 00:04

You can do this with either the Berkeley Parser or Stanford Parser. But I don't know if there's a Python interface available for either.
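
Newer NLTK releases do ship a wrapper under nltk.parse.stanford (since deprecated in favor of the CoreNLP interface). Assuming that wrapper, a local copy of the parser jars, and Java on the path, a rough sketch might look like this; the jar paths below are placeholders:

    from nltk.parse.stanford import StanfordParser

    # Placeholder paths: point these at your local Stanford Parser download.
    parser = StanfordParser(path_to_jar='stanford-parser.jar',
                            path_to_models_jar='stanford-parser-models.jar')

    for tree in parser.raw_parse("She has been writing all morning."):
        # Count the VP constituents in the parse tree.
        vps = list(tree.subtrees(filter=lambda t: t.label() == 'VP'))
        print(len(vps), [' '.join(vp.leaves()) for vp in vps])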

ars
  • Thanks a lot, this might be an option - however, as I am heavily using NLTK already, it might be quite a lot of work to switch. Will look, though. – Michael Pliskin Aug 09 '10 at 10:59
  • There is an interface for the Stanford Parser in the NLTK. You can use it as follows: `tagger = nltk.tag.stanford.POSTagger('models/german-fast.tagger', 'stanford-postagger.jar')` You may have to encode the strings to UTF-8 first (at least for the German model). – Suzana Mar 21 '13 at 16:44