Highest Voted 'stemming' Questions

4

votes

1 answer

Italian Stemmer alternative to Snowball

I'm trying to analyze Italian texts in R. As is usual in text analysis, I have removed all punctuation, special characters, and Italian stopwords. But I have a problem with stemming: there is only one Italian stemmer (Snowball),…

r nlp stemming

asked Aug 21 '19 at 13:12

Danny Paganin

4

votes

1 answer

Is there a way to reverse stem in python nltk?

I have a list of stems in NLTK/Python and want to get the possible words that produce each stem. Is there a way to take a stem and get a list of words that will stem to it in Python?

python nltk stemming

asked Jul 10 '19 at 21:07

JoeShmoe
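
A stemmer discards information, so a stem cannot be expanded back into words on its own. One possible approach to the question above is to stem every word in a known vocabulary and record which words map to each stem. The sketch below assumes such a vocabulary is available; `toy_stem` is a hypothetical stand-in for a real stemmer, and `build_reverse_index` is an illustrative name, not an NLTK function.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer; strips a few English suffixes."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when a reasonably long base remains.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_reverse_index(vocabulary, stem=toy_stem):
    """Map each stem to the set of vocabulary words that reduce to it."""
    index = defaultdict(set)
    for word in vocabulary:
        index[stem(word)].add(word)
    return index


# Assumed vocabulary; in practice this would come from your own corpus.
vocabulary = ["walk", "walks", "walked", "walking", "talk", "talking"]
reverse_index = build_reverse_index(vocabulary)
print(sorted(reverse_index["walk"]))
```

The lookup can only return words that appear in the vocabulary, so coverage depends entirely on the word list used to build the index.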

4

votes

2 answers

Difference between Lucene stemmers: EnglishStemmer, PorterStemmer, LovinsStemmer

Has anybody compared these stemmers from Lucene (package org.tartarus.snowball.ext): EnglishStemmer, PorterStemmer, LovinsStemmer? What are the strong and weak points of the algorithms behind them? When should each of them be used? Or maybe there are some…

java lucene stemming

asked Feb 21 '11 at 16:55

Paul Lysak

4

votes

4 answers

python, Stemmer not found

I got this code from GitHub, and it runs on a 64-bit Windows machine. Here's the error I get: Traceback (most recent call last): File "new.py", line 2, in <module> import stemmer ModuleNotFoundError: No module named 'stemmer' import…

python-3.x python-import stemming

asked Apr 03 '18 at 16:05

saqibiqbal

4

votes

1 answer

Stemming words with NLTK (python)

I am new to Python text processing and am trying to stem the words in a text document of around 5000 rows. I have written the script below: from nltk.corpus import stopwords # Import the stop word list from nltk.stem.snowball import SnowballStemmer stemmer =…

python stemming

asked Aug 14 '17 at 08:39

user3734568

4

votes

3 answers

Stemming full strings on Python

I need to perform stemming on Portuguese strings. To do so, I'm tokenizing the string using the nltk.word_tokenize() function and then stemming each word individually. After that, I rebuild the string. It works, but it doesn't perform well. How can I make…

python nlp nltk stemming

asked Jul 19 '17 at 00:38

yuridamata
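
The tokenize, stem, and rebuild pipeline described in the question above can be sketched as follows. Because natural text repeats words often, caching stems may reduce the work per string, though whether it helps in a given case would need measuring. This is a minimal sketch under stated assumptions: `toy_stem` is a hypothetical stand-in for a real Portuguese stemmer, and `str.split` stands in for `nltk.word_tokenize()`.

```python
from functools import lru_cache


def toy_stem(word):
    """Hypothetical stand-in for a real Portuguese stemmer."""
    for suffix in ("mente", "es", "s"):
        # Only strip when a reasonably long base remains.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


@lru_cache(maxsize=None)
def cached_stem(word):
    """Stem each distinct word only once; repeats are served from the cache."""
    return toy_stem(word)


def stem_text(text):
    """Tokenize on whitespace, stem each token, and rebuild the string."""
    return " ".join(cached_stem(token) for token in text.lower().split())


print(stem_text("casas casas flores"))
```

`cached_stem.cache_info()` reports hits and misses, which gives a quick way to check how much repetition the cache is actually absorbing on real data.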

4

votes

0 answers

Stemming Dutch words with the Kraaij-Pohlmann algorithm

I am trying to stem Dutch words in a corpus in R. I have found the SnowballC package, but this doesn't seem to work well for Dutch. For example: wordStem(c("huis", "huizen", "huisje", "huisjes"), language = "porter") [1] "huis" "huiz" …

r stemming

asked Jun 25 '17 at 11:58

Charlotte

4

votes

1 answer

Smart stemming/lemmatizing in Python for Nationalities

I am working with Python, and I would like to find the roots of some words that mainly refer to countries. Some examples that demonstrate what I need are: Spanish should give me Spain. English should give me England. American should give me…

python nltk stemming lemmatization

asked Feb 03 '17 at 15:07

Adrian Monk

4

votes

4 answers

SQL word root matching

I'm wondering whether major SQL engines out there (MS SQL, Oracle, MySQL) have the ability to understand that 2 words are related because they share the same root. We know it's easy to match "networking" when searching for "network" because the…

sql nlp stemming lemmatization

asked Oct 29 '10 at 11:55

Max

4

votes

2 answers

Stop-word elimination and stemmer in python

I have a somewhat large document and want to do stop-word elimination and stemming on the words of this document with Python. Does anyone know an off-the-shelf package for these? If not, code that is fast enough for large documents is also…

python nlp stemming stop-words

asked Oct 07 '10 at 14:53

Hossein

4

votes

1 answer

StandardAnalyzer with stemming

Is there a way to integrate PorterStemFilter into StandardAnalyzer in Lucene, or do I have to copy and paste StandardAnalyzer's source code and add the filter, since StandardAnalyzer is defined as a final class? Is there a smarter way? Also, if I would…

lucene stemming porter-stemmer

asked Sep 07 '14 at 20:28

Kobe-Wan Kenobi

4

votes

2 answers

Are Snowball & SnowballC packages different in R?

I am using stemDocument to stem a text document with the tm package in R. Example code: data("crude") crude[[1]] stemDocument(crude[[1]]) I get an error message: Error in loadNamespace(name) : there is no package called ‘Snowball’ I have…

r stemming tm snowball

asked May 07 '14 at 20:58

Ram

4

votes

3 answers

MySQL fulltext with stems

I am building a little search function for my site. I am taking my user's query, stemming the keywords and then running a fulltext MySQL search against the stemmed keywords. The problem is that MySQL is treating the stems as literal. Here is the…

mysql full-text-search stemming

asked Jan 14 '10 at 04:11

johnnietheblack

4

votes

1 answer

NLTK words lemmatizing

I am trying to lemmatize words with NLTK. So far I have found that I can use the stem package to get results such as transforming "cars" to "car" and "women" to "woman"; however, I cannot lemmatize some words with affixes like…

python nlp nltk stemming lemmatization

asked Jul 16 '13 at 18:21

noben

4

votes

1 answer

Advanced Search Option in Solr corresponding to DtSearch options

We are replacing the search and indexing module in an application from DtSearch to Solr, using SolrNet as the .NET Solr client library. We are relatively new to Solr/Lucene and would need some help and direction to understand the more advanced search…

solr solrnet fuzzy-search stemming advanced-search

asked Feb 07 '13 at 05:35

koder
