Highest Voted 'stemming' Questions

2

votes

1 answer

Lucene synonym expansion, stemming, spell check and more

I am using Lucene to index my database and then perform a phrase search on a specific field (field name: keyword). I am currently using the following code: String userQuery = request.getParameter("query"); //create standard analyzer…

asked Aug 13 '11 at 21:50

Prim

1,312
5
25
51

2

votes

1 answer

How to get a nested list by stemming the words inside the nested lists?

I have a Python list, tokens, made up of several sublists of tokens. I want to stem the tokens in it so that the output matches stemmed_expected. tokens = [['cooked', 'lovely','baked'],['hotel',…

python list for-loop nested-lists stemming

asked Dec 05 '21 at 04:33

Dakshila Kamalsooriya

1,391
4
17
36
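A minimal sketch of the nested-list pattern this question asks about, assuming the goal is to keep the outer structure and stem each inner token. The `toy_stem` function and the sample data are hypothetical stand-ins so the snippet runs with no dependencies. In practice a real stemmer's method (for example the `stem` method of NLTK's `PorterStemmer`) could be passed in its place.

```python
from typing import Callable, List


def toy_stem(word: str) -> str:
    """Hypothetical toy stemmer: strips one common English suffix.

    This is NOT the Porter algorithm; it only stands in for a real stemmer.
    """
    for suffix in ("ing", "ed", "ly", "s"):
        # Keep at least three characters of the word after stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_nested(tokens: List[List[str]],
                stem: Callable[[str], str] = toy_stem) -> List[List[str]]:
    """Return a new nested list with every inner token stemmed."""
    return [[stem(token) for token in sublist] for sublist in tokens]


# Made-up sample data (the question's own list is truncated in the excerpt).
tokens = [["cooked", "lovely", "baked"], ["running", "hotels"]]
stemmed = stem_nested(tokens)
print(stemmed)
```

The nested comprehension builds a new list of lists, so the original `tokens` is left unchanged and the sublist boundaries are preserved.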

2

votes

1 answer

Stemming and lemmatizing - What approach?

I am preparing to do topic modeling via Mallet and have finished pulling the raw datasets. Before I import and start modeling, I need to take some steps to clean and streamline the texts, of course. I have my lists of stopwords ready and I know…

lda topic-modeling stemming lemmatization mallet

asked Jun 28 '21 at 18:16

Glorifier

31
1

2

votes

1 answer

Solr search/faceting results have strange behaviour: I only get "stemmed" strings (hope that's the correct term)

Sorry for such a bad title, but I didn't know how to describe my problem. I'm using sunburnt (a Python interface) to query Solr within my Django app. When I'm searching, everything is OK: I get the full string. On the other hand, if I'm faceting…

filter solr stemming facet

asked Jul 21 '11 at 13:24

Samuele Mattiuzzo

10,760
5
39
63

2

votes

2 answers

How do I Get All Attributes Of Synsets?

Please give me an example that shows all the attributes of a synset of a word. I only know these attributes: name, lemma_names, definition. synsetsWord = ObjWn.synsets( 'Book' ) i = 0 for senseWord in synsetsWord: …

python nltk wordnet stemming

asked Jul 12 '11 at 04:39

Masoud Abasian

10,549
6
23
22

2

votes

0 answers

Elasticsearch German stemmer doesn't do plural

I'm working on a basic German analyzer in Elasticsearch, which is defined as follows: { "settings": { "analysis": { "filter": { "german_stemmer": { "type": "snowball", "language": "German" }, …

elasticsearch analyzer stemming

asked Feb 02 '21 at 09:49

Lior Magen

1,533
2
15
33

2

votes

2 answers

Exact word search in Solr

I have a question which closely relates to this question. In my schema I have a field This gives an exact match, ie. stemming disabled eat = eat Is it possible,…

search lucene solr stemming

asked Jun 21 '11 at 16:14

Ruth

5,646
12
38
45

2

votes

4 answers

How to find basic, uninflected word for searching?

I am having trouble trying to write a search engine that treats all inflections of a word as the same basic word. So for verbs these are all the same root word, be: number/person (e.g. am; is; are) tense/mood like past or future tense (e.g.…

perl search nlp stemming lemmatization

asked May 31 '11 at 17:30

Jon

757
5
20

2

votes

1 answer

Porter and Lancaster stemming clarification

I am doing stemming using Porter and Lancaster and I find these observations:
Input: replied; Porter: repli; Lancaster: reply
Input: twice; Porter: twice; Lancaster: twic
Input: came; Porter: came; Lancaster: cam
Input: In; Porter: …

nlp nltk stemming porter-stemmer nltk-book

asked Feb 25 '20 at 03:55

floss

2,603
2
20
37

2

votes

1 answer

What is the real purpose of Stemming in NLP?

I know about stemming and lemmatizing as follows:
stemming - converts words into non-changing portions; amusing, amusement - amus
lemmatizing - converts words to dictionary form; amusing, amusement - amuse
I can understand why to use lemmatization.…

nlp stemming lemmatization

asked Jan 23 '20 at 06:44

Karanam Krishna

365
2
16

2

votes

1 answer

How to exclude certain names and terms from stemming (Python NLTK SnowballStemmer (Porter2))

I am new to NLP, Python, and posting on Stack Overflow all at the same time, so please be patient with me if I seem ignorant :). I am using SnowballStemmer in Python's NLTK in order to stem words for textual analysis. While…

python nlp nltk stemming lemmatization

asked Dec 10 '19 at 11:56

ylimenibor

23
2
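One common way to approach this kind of exclusion is to check each word against a protected set before handing it to the stemmer. The sketch below assumes that approach. `naive_stem`, `stem_except`, and the sample words are hypothetical and only illustrate the wrapper; a real stemmer's `stem` method (such as NLTK's `SnowballStemmer("english").stem`) could be supplied instead.

```python
from typing import Callable, Iterable, List


def naive_stem(word: str) -> str:
    """Stand-in stemmer (assumption): lowercases and drops a trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def stem_except(words: Iterable[str],
                protected: Iterable[str],
                stem: Callable[[str], str] = naive_stem) -> List[str]:
    """Stem every word except those in `protected` (case-insensitive match)."""
    protected_lower = {p.lower() for p in protected}
    return [w if w.lower() in protected_lower else stem(w) for w in words]


# Made-up example: keep the proper noun intact, stem everything else.
words = ["Athens", "hosts", "Games"]
result = stem_except(words, protected={"Athens"})
print(result)
```

Protected words are returned exactly as written, so their original casing survives even if the stemmer itself lowercases its input.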

2

votes

1 answer

English stemming or lemmatization in Lucene.NET without SnowBall Analyzer or a custom analyzer

Is there a non-obsolete Lucene.NET Analyzer that can do English-language stemming or lemmatization, or do I need to write a custom Analyzer? I can't seem to find an Analyzer that includes PorterStemFilter or EnglishMinimalStemFilter in the source…

lucene.net stemming

asked Aug 10 '19 at 03:17

Justin Dearing

14,270
22
88
161

2

votes

1 answer

Lemmatisation of web scraped data

Let's suppose that I have a text document such as the following: document = '

I am a sentence. I am another sentence

I am a third sentence.' ( or a more complex text example: document = '

Forde Education are looking to recruit a Teacher of…

python nlp text-parsing stemming lemmatization

asked Mar 22 '19 at 10:28

Outcast

4,967
5
44
99

2

votes

1 answer

What is the correct use of stemDocument?

I have already read this and this questions, but I still didn't understand the use of stemDocument in tm_map. Let's follow this example: q17 <- VCorpus(VectorSource(x = c("poder", "pode")), readerControl = list(language = "pt", …

r text-mining tm stemming snowball

asked Jan 15 '19 at 11:06

Guilherme Parreira

944
9
29

2

votes

2 answers

Why is stemming important for sentiment analysis

I am using seven lexicons to calculate sentiment scores on a data set containing forum posts. Apart from removing all noise such as whitespace, special characters, digits and stopwords, why is it also important to stem the words? I am using Harvard.IV,…

r sentiment-analysis text-analysis stemming

asked Nov 04 '18 at 21:49

Ola

81
1
8
