Questions tagged [part-of-speech]

Linguistic category of words

In grammar, a part of speech (also a word class, a lexical class, or a lexical category) is a linguistic category of words (or more precisely lexical items), which is generally defined by the syntactic or morphological behaviour of the lexical item in question.

From http://en.wikipedia.org/wiki/Parts_of_speech

194 questions
0
votes
1 answer

Part-of-Speech tagging: what is the difference between known words and unknown words?

I am trying to understand the result evaluation table (Table 1) of this paper. Three accuracies are reported: overall, unknown words (UW), and known words (KW), along with the percentage of unknown words (% unk.). Are the known words the data that…
AziZ
  • 149
  • 1
  • 12
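
One common reading of these columns, offered here as an assumption rather than as the paper's own definition: a known word is a test token whose word form occurred in the training data, and an unknown word is one that did not. A minimal sketch under that assumption follows; `split_accuracy` and the toy data are hypothetical names used only for illustration.

```python
def split_accuracy(train_vocab, test_tokens, gold_tags, predicted_tags):
    """Return overall, known-word and unknown-word accuracy plus the unknown-word rate.

    Assumption: a token is "known" if its word form appears in train_vocab.
    """
    counts = {"known": [0, 0], "unknown": [0, 0]}  # [correct, total] per bucket
    for word, gold, pred in zip(test_tokens, gold_tags, predicted_tags):
        bucket = "known" if word in train_vocab else "unknown"
        counts[bucket][1] += 1
        counts[bucket][0] += int(gold == pred)

    def ratio(correct, total):
        # Avoid division by zero when a bucket is empty.
        return correct / total if total else float("nan")

    k_correct, k_total = counts["known"]
    u_correct, u_total = counts["unknown"]
    total = k_total + u_total
    return {
        "overall": ratio(k_correct + u_correct, total),
        "known": ratio(k_correct, k_total),
        "unknown": ratio(u_correct, u_total),
        "pct_unknown": ratio(u_total, total),
    }


# Toy data: "cat" and "loudly" never occurred in training, so they are unknown.
train_vocab = {"the", "dog", "barks"}
test_tokens = ["the", "cat", "barks", "loudly"]
gold_tags = ["DET", "NOUN", "VERB", "ADV"]
predicted_tags = ["DET", "NOUN", "VERB", "ADJ"]
result = split_accuracy(train_vocab, test_tokens, gold_tags, predicted_tags)
```

Taggers are usually much less accurate on unknown words, which is presumably why the two figures are reported separately.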
0
votes
1 answer

PyTorch: ValueError: Expected input batch_size (256) to match target batch_size (128)

I ran into a ValueError while training a BiLSTM part-of-speech tagger using PyTorch. ValueError: Expected input batch_size (256) to match target batch_size (128). def train(model, iterator, optimizer, criterion, tag_pad_idx): epoch_loss =…
NLP Dude
  • 3
  • 3
0
votes
1 answer

Loading manually annotated data to train RNN POS tagger

I've got a large manually annotated dataset. I would like to train a part-of-speech tagger using an RNN. The data is similar to the text below: Lorem Ipsum dummy text printing typesetting Ipsum Ipsum…
0
votes
1 answer

How to get the tagset from nltk pos_tag?

I'm trying to get the full tag from nltk pos_tag, but I can't find a simple way to do it using nltk. For example, using tagset='universal'. from nltk.tokenize import word_tokenize def nltk_pos(text): token = word_tokenize(text) return…
Y4RD13
  • 937
  • 1
  • 16
  • 42
0
votes
1 answer

Extract POS tag for a word coming before a given word

I am new to Python and I am trying to extract the part of speech (using Stanford CoreNLP) of the word that comes before a given word. For the text = "انسان يحضر طعامه باستخدام الخبز الابيض وبجانبه قطة سوداء؟", here is my code: for i in nouns: …
sawsan alzubi
  • 21
  • 1
  • 5
0
votes
1 answer

Search Corpus by Part-of-Speech

I am new to NLP. I am trying to search a corpus for a part-of-speech sequence. The goal is to take a sequence of POS tags and find all sentences in a given corpus that match it. Input: The quick brown fox jumped over the lazy dogs.…
alandalusi
  • 1,145
  • 4
  • 18
  • 39
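
A minimal sketch of one way to do this kind of search, assuming the corpus is already tagged as lists of (word, tag) pairs (the shape that `nltk.pos_tag` returns). The function name `find_pos_sequence` and the hand-tagged toy corpus are hypothetical; only contiguous matches are considered.

```python
def find_pos_sequence(tagged_sentences, pattern):
    """Yield (sentence_index, start_offset) for each contiguous match of pattern.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    pattern: sequence of tags to look for, e.g. ["JJ", "NN"].
    """
    pattern = list(pattern)
    n = len(pattern)
    if n == 0:
        return  # an empty pattern matches nothing in this sketch
    for i, sentence in enumerate(tagged_sentences):
        tags = [tag for _, tag in sentence]
        # Slide a window of length n over the tag sequence.
        for start in range(len(tags) - n + 1):
            if tags[start:start + n] == pattern:
                yield i, start


# Hand-tagged toy corpus (Penn Treebank style tags, assumed for illustration).
corpus = [
    [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
     ("jumped", "VBD"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
     ("dogs", "NNS"), (".", ".")],
    [("Dogs", "NNS"), ("bark", "VBP"), (".", ".")],
]
matches = list(find_pos_sequence(corpus, ["JJ", "NN"]))
```

Here `matches` holds the sentence index and token offset of each hit, so the matching sentences can be looked up in `corpus` afterwards. Note that tags are compared exactly, so `NNS` does not match `NN`.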
0
votes
1 answer

Counter to return null-value if Part of Speech tag not present

Currently I am trying to count how often a certain part of speech occurs in a given online review. While I am able to retrieve the tag corresponding to each word and count these instances, I have difficulty also capturing the…
Principia
  • 125
  • 1
  • 7
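
A minimal sketch of one way to get an explicit zero for tags that never occur, assuming the review is already tagged as (word, tag) pairs. It relies on the fact that `collections.Counter` returns 0 for missing keys; `TAGS_OF_INTEREST` and `tag_counts` are hypothetical names.

```python
from collections import Counter

# Hypothetical fixed list of tags that every review should report on.
TAGS_OF_INTEREST = ["NN", "JJ", "VB", "RB"]


def tag_counts(tagged_tokens, tags=TAGS_OF_INTEREST):
    """Count tags in one review, reporting 0 for tags that do not occur."""
    counts = Counter(tag for _, tag in tagged_tokens)
    # Counter returns 0 for a missing key instead of raising KeyError.
    return {tag: counts[tag] for tag in tags}


review = [("great", "JJ"), ("phone", "NN"), ("great", "JJ"),
          ("battery", "NN"), ("life", "NN")]
row = tag_counts(review)
```

Because every review yields the same keys in the same order, the resulting dictionaries line up as rows of a table.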
0
votes
1 answer

spaCy instead of NLTK for POS tagging

I have a function based on nltk.pos_tag that filters collocations in a text, keeping only those where an adjective (JJ) and a noun (NN) occur together. f1=u'this is my random text' tokens = word_tokenize(f1) bigramFinder =…
0
votes
1 answer

How can I use the Stanford NLP Part-of-speech tagging in Spanish?

I am working with Stanford CoreNLP and I have a question. I want to determine the grammatical category of each word, and when I process the text from the command line with: java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props…
0
votes
1 answer

spaCy custom POS model for Hindi

I recently worked on training a part-of-speech model for Hindi in spaCy. The model is trained, but when analyzing any text, the .pos_ attribute of every token always points to X. The fine-grained tags, .tag_ - which were the ones the model…
Adrián
  • 3
  • 1
0
votes
1 answer

How to pass part-of-speech in WordNetLemmatizer?

I am preprocessing text data. However, I am facing an issue with lemmatization. Below is the sample text: 'An 18-year-old boy was referred to prosecutors Thursday for allegedly stealing about ¥15 million ($134,300) worth of cryptocurrency last year…
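
A minimal sketch of the usual approach, assuming the tokens carry Penn Treebank tags (as produced by `nltk.pos_tag`): map each tag to one of the single-letter part-of-speech values that `WordNetLemmatizer.lemmatize(word, pos=...)` accepts ('n', 'v', 'a', 'r'). The helper name `penn_to_wordnet` is hypothetical, and falling back to noun is a design choice, not a rule.

```python
def penn_to_wordnet(tag, default="n"):
    """Map a Penn Treebank tag to a WordNet part-of-speech letter.

    'a' = adjective, 'v' = verb, 'r' = adverb, 'n' = noun.
    Tags with no WordNet counterpart fall back to `default`.
    """
    if tag.startswith("JJ"):
        return "a"
    if tag.startswith("VB"):
        return "v"
    if tag.startswith("RB"):
        return "r"
    if tag.startswith("NN"):
        return "n"
    return default


# Usage with NLTK (requires nltk and its WordNet data to be installed):
#   from nltk.stem import WordNetLemmatizer
#   lemmatizer = WordNetLemmatizer()
#   lemma = lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
tagged = [("stealing", "VBG"), ("prosecutors", "NNS"), ("allegedly", "RB")]
wordnet_pos = [(word, penn_to_wordnet(tag)) for word, tag in tagged]
```

Without the `pos` argument the lemmatizer treats every word as a noun, which is why verb forms such as "stealing" are otherwise left unchanged.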
0
votes
1 answer

Extract pos_tag_sents from pandas series

Following the advice in the thread "How to apply pos_tag_sents() to pandas dataframe efficiently", I ran the code to identify the different parts of speech of the text in one of my variables. Now that I managed to create the column of interest - sub['POS'] - how…
Filippo Sebastio
  • 1,112
  • 1
  • 12
  • 23
0
votes
1 answer

Separating nouns and noun groups using NLTK from a JSON file

I want to find or separate nouns and groups of nouns using NLTK from a JSON file. This is the JSON file content: [ { "id": 18009, "ingredients": [ "baking powder", "eggs", "all-purpose flour", "raisins", "milk", …
0
votes
1 answer

Checking the order of a list of tuples

I have a list of tuples that are generated from a string using NLTK's PoS tagger. I'm trying to find the "intent" of a specific string in order to append it to a dataframe, so I need a way to generate a syntax/grammar rule. string = "RED WHITE…
Sebastian Goslin
  • 477
  • 1
  • 3
  • 22
0
votes
1 answer

PolyAnalyst: Is there a list of Part of Speech Tags?

Can someone provide a list of the POS tagger tags assigned to the _tagged column that is created? We need to know all the possible values that can be assigned and what each one means. For example: En_NN = noun, En_NNS = plural noun, etc. Similarly…