Questions tagged [part-of-speech]

Linguistic category of words

In grammar, a part of speech (also a word class, a lexical class, or a lexical category) is a linguistic category of words (or more precisely lexical items), which is generally defined by the syntactic or morphological behaviour of the lexical item in question.

From http://en.wikipedia.org/wiki/Parts_of_speech

194 questions
0 votes, 1 answer

Stanford Training Lambda Too Big

I am using the Stanford POS Tagger to train on a corpus. I prepared the settings file "Prop", formatted the data, and started the training. After that, I started getting messages like "Lambda Too Big", and these messages kept occurring until the end of…
ykh
0 votes, 1 answer

NLTK PoS tagging

I'm new to Python and need it for PoS tagging, so I tried to use the standard tools. I tried to create a tagger and got a ValueError that I don't understand. My code: import nltk tagged_sents = nltk.corpus.brown.tagged_sents(categories =…
P.S.
0 votes, 0 answers

Tagging references/citations in text

I need to find a way to tag references to publications in text. We've been doing this via regex, but it won't work for these new patterns. Some examples (the language is German): Herzog (August 2012), Einkommensteuerskriptum Band 1, S 8 Achatz/Bieber in…
pypat
0 votes, 2 answers

Extracting sentences from a text document

I have a text document from which I'd like to extract the noun phrases. In the first step I extract sentences, then I do part-of-speech (POS) tagging for each sentence, and then I use the POS tags for chunking. I used StanfordNLP for these tasks,…
HHH
0 votes, 1 answer

Part of speech tagging in OpenNLP vs. StanfordNLP

I'm new to part-of-speech (POS) tagging and I'm doing POS tagging on a text document. I'm considering using either OpenNLP or StanfordNLP for this. For StanfordNLP I'm using a MaxentTagger and I use english-left3words-distsim.tagger to train it. In…
HHH
0 votes, 1 answer

How to assign a score to each chunk in a sentence?

I'm working on a keyword extraction task in which I'd like to extract phrases instead of words. In order to chunk each sentence into meaningful parts, I do part-of-speech tagging first and then, based on linguistic rules, extract only the Noun…
HHH
0 votes, 0 answers

How to use the Brown corpus included in the NLTK toolkit to obtain the numbers and average numbers of words in specific grammar categories

I have a text document and I need to obtain the numbers and average numbers of words in specific grammar categories (e.g., adverbs, adjectives, verbs, pronouns) using the Brown corpus included in NLTK.
user5232014
0 votes, 1 answer

Part of speech tagging and entity recognition - python

I want to perform part-of-speech tagging and entity recognition in Python, similar to the Maxent_POS_Tag_Annotator and Maxent_Entity_Annotator functions of openNLP in R. I would prefer Python code that takes a textual sentence as input and gives…
0 votes, 3 answers

Select only 'NN' and 'VB' words from NLTK pos_tag

I need to print only 'NN' and 'VB' words from an entered sentence. import nltk import re import time var = raw_input("Please enter something: ") exampleArray = [var] def processLanguage(): try: for item in exampleArray: …
Rajitha
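
The filtering step this question asks about can be sketched without NLTK, assuming the tagger output is a list of (word, tag) tuples, which is the shape `nltk.pos_tag` returns. The helper name `filter_by_tag` and the hand-written sample tags are hypothetical. Whether 'NN' should also cover tags such as 'NNS' is not stated in the question, so both readings are shown.

```python
def filter_by_tag(tagged, prefixes=("NN", "VB")):
    """Keep words whose tag starts with one of the given prefixes."""
    # str.startswith accepts a tuple, so "NNS" and "VBZ" also match.
    return [word for word, tag in tagged if tag.startswith(prefixes)]


# Hand-written sample in the (word, tag) shape; not real tagger output.
tagged = [("This", "DT"), ("is", "VBZ"), ("a", "DT"),
          ("test", "NN"), ("sentence", "NN"), (".", ".")]

# Prefix match: all noun and verb tags.
nouns_and_verbs = filter_by_tag(tagged)

# Exact match: only the literal tags 'NN' and 'VB'.
exact_only = [word for word, tag in tagged if tag in {"NN", "VB"}]

print(nouns_and_verbs)
print(exact_only)
```

The prefix version is usually what is wanted when plural nouns and inflected verbs should be included; the exact version keeps only singular common nouns and base-form verbs.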
0 votes, 1 answer

Forcing POS tags in Stanford CoreNLP

Is there a way to process an already POS-tagged text using Stanford CoreNLP? For example, I have the sentence in this format They_PRP are_VBP hunting_VBG dogs_NNS ._. and I'd like to annotate with lemma, ner, parse, etc. by forcing the given POS…
0 votes, 1 answer

Stanford NLP: Chinese Part of Speech labels?

I am trying to find a table explaining each label in the Chinese part-of-speech tagger for the 2015.1.30 version. I couldn't find anything on this topic. The closest thing I could find was in the "Morphological features help POS tagging of unknown…
Kevin Zhao
0 votes, 1 answer

POS accuracy of known and unknown words

How do I calculate the accuracy on known and unknown words in part-of-speech tagging? For example, for known words, is it the number of correctly tagged known words divided by the number of all known words? Are there other ways?
math
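
One reading of the calculation described in this question: treat a word as "known" if it occurred in the training data, then compute accuracy separately over the known and the unknown tokens. The sketch below assumes that definition. The function name `split_accuracy` and the toy data are hypothetical and do not come from the question.

```python
def split_accuracy(gold, predicted, training_vocab):
    """Return (known_accuracy, unknown_accuracy) for aligned tagged tokens.

    gold and predicted are equal-length lists of (word, tag) pairs.
    A word counts as "known" if it is in training_vocab (an assumption).
    """
    correct = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for (word, gold_tag), (_, pred_tag) in zip(gold, predicted):
        known = word in training_vocab
        total[known] += 1
        if gold_tag == pred_tag:
            correct[known] += 1
    # None signals an empty group and avoids division by zero.
    known_acc = correct[True] / total[True] if total[True] else None
    unknown_acc = correct[False] / total[False] if total[False] else None
    return known_acc, unknown_acc


# Toy data, invented for illustration only.
vocab = {"the", "dog", "barks"}
gold = [("the", "DT"), ("dog", "NN"), ("barks", "VBZ"), ("loudly", "RB")]
pred = [("the", "DT"), ("dog", "NN"), ("barks", "NNS"), ("loudly", "JJ")]

known_acc, unknown_acc = split_accuracy(gold, pred, vocab)
print(known_acc, unknown_acc)
```

Reporting the two numbers separately matters because unknown-word accuracy is typically much lower and is hidden by an overall average when unknown tokens are rare.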
0 votes, 3 answers

Comparing sub items of lists and making changes in Python

I have two lists originating from a part of speech tagger which look as follows: pos_tags = [('This', u'DT'), ('is', u'VBZ'), ('a', u'DT'), ('test', u'NN'), ('sentence', u'NN'), ('.', u'.'), ('My', u"''"), ('name', u'NN'), ('is', u'VBZ'), ('John',…
Markus
0 votes, 1 answer

Are there open source deep parsers for English which take tokens and POS tags as input and produce the parse tree?

I'm wondering if there are open source probabilistic deep parsers for English which take as input a sequence of tokens and their corresponding parts of speech (POS tags), and produce the parse tree as the result. The parsers I am aware of take…
0 votes, 2 answers

GoLang PoS Tagger script taking longer than it should with no output in terminal

This script compiles without errors on play.golang.org: http://play.golang.org/p/Hlr-IAc_1f But when I run it on my machine, it takes much longer than I expect and nothing appears in the terminal. What I am trying to build is a PartOfSpeech…
gramme.ninja