How to linguistically parse English Text?

Question

Is there a way to linguistically parse English text? I mean, to get something like this:

I{I,pronoun} am{to be, verb, Present Simple} late{late, adverb}.

Or even better with dependencies, like:

I -> am -> (what?) -> late.

Preferably in Java, but it doesn't matter much.

Most proper parsers produce trees, like (S (((I pron subject) (am V-cop predicate)) (late adj predicative)), though there are other formalizations of dependencies, language models, etc. But this topic is far too wide for a StackOverflow question. — tripleee, Nov 14 '14 at 12:10
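To make the tree idea concrete, here is a minimal sketch that reads a bracketed tree into nested Python lists. It uses only the standard library; `parse_tree` is a hypothetical helper, not part of any parser, and the input is a balanced variant of the bracketing in the comment above.

```python
def parse_tree(text):
    """Parse a bracketed tree such as "(S (A x) (B y))" into nested lists."""
    # Pad parentheses with spaces so that split() yields one token each.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # skip the closing parenthesis
            return node
        if token == ")":
            raise ValueError("unexpected ')'")
        return token

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


tree = parse_tree("(S ((I pron subject) (am V-cop predicate)) (late adj predicative))")
print(tree)
```

Each inner list is one constituent, so walking the nested lists gives you the structure a real parser would produce. An unbalanced input (a missing closing parenthesis) raises an `IndexError` in this sketch.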

score 1 · Answer 1 · answered Nov 23 '14 at 20:39


The NLTK package is meant to do what you want: http://www.nltk.org/

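import nltk

# One-time model downloads; resource names can differ between NLTK versions.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

sentence = "I'm late."
words = nltk.word_tokenize(sentence)  # split the sentence into tokens
tagged = nltk.pos_tag(words)          # assign Penn Treebank POS tags
print(tagged)
# [('I', 'PRP'), ("'m", 'VBP'), ('late', 'JJ'), ('.', '.')]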
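The tags follow the Penn Treebank tag set (PRP = personal pronoun, VBP = present-tense verb, not third person singular, JJ = adjective). If you want something closer to the notation in the question, a rough sketch is to map the tags to readable names. `TAG_NAMES` and `describe` are hypothetical names of mine, and the table covers only a handful of tags:

```python
# Hypothetical helper: turn (word, tag) pairs, as returned by nltk.pos_tag,
# into the word{description} notation used in the question.
# TAG_NAMES covers only a few Penn Treebank tags; extend it as needed.
TAG_NAMES = {
    "PRP": "personal pronoun",
    "VBP": "verb, present tense, not 3rd person singular",
    "VBZ": "verb, present tense, 3rd person singular",
    "VBD": "verb, past tense",
    "JJ": "adjective",
    "RB": "adverb",
    "NN": "noun, singular",
    "NNS": "noun, plural",
}


def describe(tagged):
    parts = []
    for word, tag in tagged:
        if tag in TAG_NAMES:
            parts.append(f"{word}{{{TAG_NAMES[tag]}}}")
        else:
            # Punctuation and tags missing from the table pass through unchanged.
            parts.append(word)
    return " ".join(parts)


tagged = [('I', 'PRP'), ("'m", 'VBP'), ('late', 'JJ'), ('.', '.')]
print(describe(tagged))
```

Note that this only relabels the tags. It does not give you lemmas or dependencies; for those you would need a lemmatizer or a dependency parser.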

GAM PUB

score 0 · Answer 2 · answered Nov 14 '14 at 12:03

Many linguistic dictionaries are available online.

You can download one of them, parse it, and use it for your needs.

You should also account for misspellings and other irregularities in the input; for that, look into natural language processing (NLP).
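As a rough sketch of the dictionary approach: the `LEXICON` table below is a hypothetical stand-in for a downloaded dictionary, and its layout (word form mapped to a list of lemma and part-of-speech readings) is an assumption; real dictionaries come in their own formats. Words that are not found are reported instead of guessed.

```python
import re

# Hypothetical stand-in for a downloaded dictionary: each lowercase word form
# maps to a list of (lemma, part of speech) readings.
LEXICON = {
    "i": [("I", "pronoun")],
    "am": [("be", "verb")],
    "late": [("late", "adjective"), ("late", "adverb")],
}


def lookup(sentence):
    """Return (word, readings) pairs; readings is empty for unknown words."""
    words = re.findall(r"[A-Za-z']+", sentence)
    return [(word, LEXICON.get(word.lower(), [])) for word in words]


for word, readings in lookup("I am layte"):
    if readings:
        print(word, readings)
    else:
        print(word, "-> not in dictionary (possible typo)")
```

A plain lookup returns every reading of a word ("late" is both an adjective and an adverb here), so picking the right one still needs context, which is what a tagger provides.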
