Questions tagged [part-of-speech]

Linguistic category of words

In grammar, a part of speech (also a word class, a lexical class, or a lexical category) is a linguistic category of words (or more precisely lexical items), which is generally defined by the syntactic or morphological behaviour of the lexical item in question.

From http://en.wikipedia.org/wiki/Parts_of_speech

194 questions
2 votes, 3 answers

Painfully slow Postgres query using WHERE on many adjacent rows

I have the following psql table. It has roughly 2 billion rows in total.

id  word      lemma  pos    textid  source
1   Stuffing  stuff  vvg    190568  AN
2   her       her    appge  190568  …
Znusgy
2 votes, 2 answers

NLTK single-word part-of-speech tagging

Is there a way to use NLTK to get the set of possible parts of speech of a single string of letters, taking into account that different words might have homonyms? For example: report -> {Noun, Verb}, kind -> {Adjective, Noun} I have not been able…
Leland Reardon
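
One way to picture the "set of possible tags per word" idea from the question above is to collect every tag a word form occurs with in an already tagged corpus. The following is only a minimal sketch under that assumption: the sample data, the tag names, and the helper names `build_tag_lexicon` and `possible_tags` are made up for illustration and are not part of NLTK.

```python
from collections import defaultdict


def build_tag_lexicon(tagged_words):
    """Map each lowercased word form to the set of tags it occurs with."""
    lexicon = defaultdict(set)
    for word, tag in tagged_words:
        lexicon[word.lower()].add(tag)
    return lexicon


def possible_tags(lexicon, word):
    """Return a copy of the tag set for `word`, or an empty set if unseen."""
    return set(lexicon.get(word.lower(), set()))


# Hypothetical hand-made sample; a real tagged corpus would go here instead.
SAMPLE = [
    ("The", "DET"), ("report", "NOUN"), ("was", "VERB"), ("kind", "ADJ"),
    ("They", "PRON"), ("report", "VERB"), ("a", "DET"), ("kind", "NOUN"),
]

LEXICON = build_tag_lexicon(SAMPLE)
print(possible_tags(LEXICON, "report"))
print(possible_tags(LEXICON, "kind"))
```

The coverage of such a lexicon is limited to the words and readings that actually appear in the corpus it was built from.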
2 votes, 2 answers

Handling (, ,) and (. .) and other punctuation when processing natural language parse trees with Lisp

My question has to do with post-processing of part-of-speech tagged and parsed natural language sentences. Specifically, I am writing a component of a Lisp post-processor that takes as input a sentence parse tree (such as, for example, one produced…
user3990797
2 votes, 0 answers

Stanford part-of-speech tagger cannot tag parentheses and quotation marks in pre-tokenized text

I have a pre-tokenized text as the input to the Stanford part-of-speech tagger. It cannot tag parentheses and quotation marks correctly at all. I don't want the Stanford tagger's default tokenization, so I disabled it using the -tokenize false option. I know…
DehengYe
2 votes, 2 answers

Stanford POSTagger with UIMA

I am trying to build a POS (part-of-speech) tagger in a UIMA pipeline. I have downloaded the Stanford POSTagger jar, attached it to the project, and copied the models for English, but it throws an exception. My code: package…
Narendra Rawat
2 votes, 2 answers

What tools can I use to find Part Of Speech Patterns

I am looking for tools to find Part Of Speech patterns on a corpus of documents. I am using the Stanford NLP tools for POS tagging my documents. Now I would like to query these tagged documents and find some specific POS patterns such as for…
azpublic
2 votes, 0 answers

Part of speech search with lucene

After much googling, I decided to post my problem here, hoping that someone can help me. What I want to achieve is to perform queries such as the following: q1: (adjective) "jumps" (preposition) // any adj followed by "jumps" followed by any prep. q2:…
andatarvid
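
A query like q1 above (any adjective, then the literal word "jumps", then any preposition) can be read as a sliding-window match over (word, tag) pairs. The sketch below illustrates only that reading in plain Python. It is not a Lucene solution, and the constraint format, the tag names, and the function names `matches` and `find_pattern` are assumptions made up for this example.

```python
def matches(token, constraint):
    """Check one (word, tag) token against one ("word" | "pos", value) constraint."""
    word, tag = token
    kind, value = constraint
    if kind == "word":
        return word.lower() == value.lower()
    if kind == "pos":
        return tag == value
    raise ValueError(f"unknown constraint kind: {kind!r}")


def find_pattern(tagged, pattern):
    """Return the word sequences of all windows in `tagged` that satisfy `pattern`."""
    if not pattern:
        return []
    size = len(pattern)
    hits = []
    for start in range(len(tagged) - size + 1):
        window = tagged[start:start + size]
        if all(matches(tok, con) for tok, con in zip(window, pattern)):
            hits.append([word for word, _ in window])
    return hits


# Hypothetical tagged sentence; the tags would normally come from a POS tagger.
SENTENCE = [
    ("The", "DET"), ("horse", "NOUN"), ("made", "VERB"), ("quick", "ADJ"),
    ("jumps", "NOUN"), ("over", "ADP"), ("fences", "NOUN"),
]

# q1: any adjective, the literal word "jumps", any preposition (adposition).
Q1 = [("pos", "ADJ"), ("word", "jumps"), ("pos", "ADP")]
print(find_pattern(SENTENCE, Q1))
```

A linear scan like this is fine for small inputs. Searching a large corpus would need an index, which is the part the question asks Lucene to provide.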
2 votes, 1 answer

Why doesn't paste with a space separator work as expected?

I need to make sentences from a list of vectors of POS tags. So I use paste with sep=' ', but this seems to have no effect on my result. Why? listPOS <- list(c("/NN", "/PDAT", "/VVFIN", "/VVPP", "./$."), c("/PPER", "/VVFIN", "/APPR",…
alex
2 votes, 1 answer

Using the Stanford Dependency Parser on a previously tagged sentence

I'm currently using the Twitter POS tagger available here to tag tweets with Penn Treebank tags. Here is that code: import java.util.List; import cmu.arktweetnlp.Tagger; import cmu.arktweetnlp.Tagger.TaggedToken; /* Tags the tweet text…
Danny Delott
2 votes, 1 answer

Determine POS tagging in English based on database files

I'm a little confused about how to determine part-of-speech tags in English. In this case, I assume that each English word has exactly one type; for example, the word "book" is recognized as NOUN, not as VERB. I want to recognize English sentences based on…
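
Under the simplifying assumption stated in the question above (each word has exactly one tag), tagging reduces to a dictionary lookup. The following is a minimal sketch of that idea. The word list, the tag names, the `UNK` fallback, and the function name `tag_sentence` are all hypothetical; in practice the dictionary would be loaded from the database files mentioned in the question.

```python
import re

# Hypothetical one-tag-per-word lexicon, standing in for the database files.
WORD_TAGS = {
    "i": "PRON",
    "read": "VERB",
    "a": "DET",
    "book": "NOUN",
}


def tag_sentence(sentence, word_tags, unknown="UNK"):
    """Split on runs of letters/apostrophes and look each token up, case-insensitively."""
    tokens = re.findall(r"[A-Za-z']+", sentence)
    return [(tok, word_tags.get(tok.lower(), unknown)) for tok in tokens]


print(tag_sentence("I read a book.", WORD_TAGS))
print(tag_sentence("I book a flight.", WORD_TAGS))
```

The second call shows the limit of the assumption: "book" is still tagged NOUN even where it is used as a verb, and "flight" falls back to the unknown tag.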
2 votes, 2 answers

Building a grammar from trees in Python

I have a text corpus that contains sentences represented as trees with their part-of-speech tags. I want to build a system that can learn a probabilistic grammar from this tree structure. Are there any built-in Python modules that can tackle…
roopalgarg
1 vote, 1 answer

How do I retrieve phrases from a NLTK.tree using custom node labels?

Given an NLTK tree produced using the code below, how do I retrieve the leaf values (phrases) that potentially match all of the node labels assigned using nltk.RegexpParser (e.g. those phrases which match the Present_Indefinite or Present_Perfect…
Samar Pratap Singh
1 vote, 1 answer

Why does a pretrained spacy pipe not work when added to a spacy.blank pipe?

I am trying to add spacy's already trained parser for Norwegian Bokmål to a blank spacy pipe. I get no error message when I add the pipe, but whatever the input, the pipe categorizes all tokens as nouns. What am I missing here? import spacy from…
1 vote, 1 answer

List of dependencies in spaCy

I'm a beginner in NLP and I've decided to start with spaCy. It's simple to handle and to comprehend. Nevertheless, I can't access the full documentation or parsing. I mean, I don't know the meaning of "IN" or "RB", for example. And displacy that is…
ZADI
1 vote, 0 answers

Mapping from Wiktionary part-of-speech tags to 12 universal part-of-speech tags

Does anyone have a Python dict mapping from the Wiktionary part-of-speech tags to the 12 universal part-of-speech tags, along with a rationale for the mapping? The 12 universal tags are: VERB - verbs (all tenses and modes) NOUN - nouns (common and…
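
To make the shape of such a mapping concrete, here is a rough sketch of what the dict asked for above might look like. The choice of Wiktionary headings and every target assignment are illustrative assumptions rather than an established or complete mapping, and the helper name `to_universal` is made up for this example. Only the 12 target tags are taken from the universal tagset named in the question.

```python
# The 12 universal part-of-speech tags.
UNIVERSAL_TAGS = {
    "VERB", "NOUN", "PRON", "ADJ", "ADV", "ADP",
    "CONJ", "DET", "NUM", "PRT", "X", ".",
}

# Assumed mapping from a few Wiktionary part-of-speech headings.
WIKTIONARY_TO_UNIVERSAL = {
    "Verb": "VERB",
    "Noun": "NOUN",
    "Proper noun": "NOUN",    # the universal NOUN tag covers common and proper nouns
    "Pronoun": "PRON",
    "Adjective": "ADJ",
    "Adverb": "ADV",
    "Preposition": "ADP",     # adpositions: prepositions and postpositions
    "Postposition": "ADP",
    "Conjunction": "CONJ",
    "Determiner": "DET",
    "Article": "DET",         # articles treated as determiners
    "Numeral": "NUM",
    "Particle": "PRT",
    "Interjection": "X",      # assumption: no dedicated tag, so fall back to X
    "Punctuation mark": ".",
}


def to_universal(heading):
    """Normalize a heading's case and map it, defaulting to the catch-all X tag."""
    return WIKTIONARY_TO_UNIVERSAL.get(heading.strip().capitalize(), "X")


print(to_universal("proper noun"))
print(to_universal("Idiom"))
```

Headings that are not parts of speech in the usual sense (idioms, proverbs, affixes and so on) fall through to `X` here, which is one possible design choice among several.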