Highest Voted 'stemming' Questions

3

votes

2 answers

Stemming process not working in Python

I have a text file that I am trying to stem after having removed stopwords, but it seems that nothing changes when I run it. My file is called data0. Here is my code: ## Removing stopwords and tokenizing by words (split each word) from nltk.corpus…

python stemming

asked Apr 01 '16 at 11:01

Economist_Ayahuasca

1,648
24
33

3

votes

0 answers

Stemming for Polish language using Google App Engine Python Search Api

I'm trying to use the Python Search API in Google App Engine to search through a set of Polish documents, and I found that the stemming feature is not working as expected. The word "red" in English has only one form, although there are different forms of it…

python google-app-engine full-text-search stemming google-app-engine-python

asked Mar 30 '16 at 21:19

Pastuszka Przemek

73
7

3

votes

0 answers

Elasticsearch snowball in French not stemming correctly

I've seen a problem with the same stem word in French. Here is an example: snowball in French or curl -XDELETE http://localhost:9200/stacko36088193 curl -XPOST http://localhost:9200/stacko36088193 -d ' { "index": { "number_of_shards": 1, …

elasticsearch stemming snowball

asked Mar 18 '16 at 15:23

Roukmoute

681
1
11
26

3

votes

1 answer

How to Stem Shakespeare/KJV Using nltk.stem.snowball

I want to stem early modern English text: sb.stem("loveth") >>> "lov" Apparently, all I need to do is a small tweak to the Snowball Stemmer: And to put the endings into the English stemmer, the list ed edly ing ingly of Step 1b should be…

python nlp nltk stemming snowball

asked Feb 29 '16 at 02:14

Joseph

691
1
4
12

3

votes

1 answer

In the Porter Stemming algorithm, what is the purpose of including an identity rule such as SS -> SS?

What is the point of the Porter Stemmer algorithm having a rule that converts SS to SS?
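For context, one common reading of Porter's Step 1a is that only the longest matching suffix rule fires, so the identity rule SS -> SS exists to claim words such as "caress" before the shorter S -> (empty) rule can strip them. The following is a minimal Python sketch of Step 1a alone, assuming the four rules as given in Porter's original description; the names `STEP_1A_RULES` and `step_1a` are illustrative and do not come from any library.

```python
# Step 1a rules in the order Porter lists them: (suffix, replacement).
# Longer suffixes come first, so "first match" is also "longest match".
STEP_1A_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ss", "ss"),  # identity rule: matches first, so the "s" rule never runs
    ("s", ""),
]


def step_1a(word, rules=STEP_1A_RULES):
    """Apply only the first rule whose suffix matches (hypothetical helper)."""
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


# The same rule table with the identity rule removed, for comparison.
RULES_WITHOUT_IDENTITY = [rule for rule in STEP_1A_RULES if rule[0] != "ss"]

print(step_1a("caress"))                          # caress
print(step_1a("caress", RULES_WITHOUT_IDENTITY))  # cares
```

Without the identity rule, "caress" falls through to the plain "s" rule and loses its final letter, which is the behaviour the extra rule appears designed to prevent.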

algorithm information-retrieval stemming

asked Oct 07 '15 at 16:41

CodyBugstein

21,984
61
207
363

3

votes

1 answer

Snowball Stemming: defining Regions

I'm trying to understand the Snowball stemming algorithm. The algorithm uses two regions, R1 and R2, which are defined as follows: R1 is the region after the first non-vowel following a vowel, or is the null region at the end of the word if…
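The region definition quoted above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard Snowball definition in which R2 is obtained by applying the same rule again inside R1. The vowel set `aeiouy` is an assumption for English (each Snowball language defines its own), the helper names are hypothetical, and the sketch ignores any language-specific exceptions a real stemmer may add.

```python
VOWELS = set("aeiouy")  # assumption: English vowels; other languages differ


def _region_start(word, start):
    """Index just past the first non-vowel that follows a vowel, from `start`."""
    for i in range(start + 1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)  # no such non-vowel: null region at the end of the word


def r1_r2(word):
    """Return the (R1, R2) substrings of `word` (hypothetical helper)."""
    r1_start = _region_start(word, 0)
    r2_start = _region_start(word, r1_start)  # same rule, applied within R1
    return word[r1_start:], word[r2_start:]


print(r1_r2("beautiful"))  # ('iful', 'ul')
print(r1_r2("beau"))       # ('', '')
```

In "beautiful", the first non-vowel after a vowel is the "t", so R1 is "iful"; repeating the rule inside R1 stops after the "f", so R2 is "ul".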

nlp stemming linguistics porter-stemmer snowball

asked Aug 06 '15 at 06:13

HW90

1,953
2
21
45

3

votes

1 answer

Solr how can I have the original term first than the stemmed version?

I have been trying to get the exact keyword match ranked first in the Solr 5.0.0 results. For example: Meditation Bowls, Goddess Bowls, Celestial Bowls, Bowling Green, 33 Bowls, Tibetan Singing Bowls, Dust Bowl Revival, Bowl of Stars. If I search for a word…

java solr lucene solrj stemming

asked Jun 29 '15 at 10:10

User123

91
6

3

votes

1 answer

Are there any Lucene stemmers that handle Shakespearean English?

I'm trying to index some old documents for searching -- 16th, 17th, 18th century. Modern stemmers don't seem to handle the antiquated word endings: worketh, liveth, walketh. Are there stemmers that specialize in the English from the time of…

solr lucene nlp stemming

asked Jun 25 '15 at 17:34

Eric Wilson

57,719
77
200
270

3

votes

1 answer

How to split a text into two meaningful words in R

This is the text in my dataframe df, which has a text column called 'problem_note_text': SSCIssue: Note Dispenser Failureperformed checks / dispensor failure / asked the stores to take the note dispensor out and set it back / still error message…

r split stemming text-analysis

asked Jun 22 '15 at 15:07

Shweta Kamble

432
2
10
21

3

votes

2 answers

Is it possible to get a natural word after it has been stemmed?

I have the word "play", which after stemming has become "plai". Now I want to get "play" back. Is it possible? I used Porter's stemmer.
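Stemming is lossy, since several different words can collapse to the same stem, so there is no general inverse. One workaround is to record which surface forms produced each stem while stemming, then look up the most frequent one. The sketch below assumes that approach; `toy_stem` is a hypothetical stand-in that mimics only the y -> i step that turns "play" into "plai", and any real stemmer function could be passed in its place.

```python
from collections import Counter, defaultdict


def toy_stem(word):
    """Stand-in stemmer (hypothetical): final y -> i if the rest has a vowel."""
    if word.endswith("y") and any(c in "aeiou" for c in word[:-1]):
        return word[:-1] + "i"
    return word


def build_unstem_index(tokens, stem):
    """Map each stem to a count of the original words that produced it."""
    index = defaultdict(Counter)
    for token in tokens:
        index[stem(token)][token] += 1
    return index


def unstem(stemmed, index):
    """Return the most frequent original word for a stem, or None if unseen."""
    candidates = index.get(stemmed)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


tokens = ["play", "play", "happy", "tree"]
index = build_unstem_index(tokens, toy_stem)
print(unstem("plai", index))  # play
```

The lookup can only return words that were seen when the index was built, and when several words share a stem it returns just the most frequent one.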

nlp stemming porter-stemmer

asked Feb 10 '15 at 18:52

odbhut.shei.chhele

5,834
16
69
109

3

votes

1 answer

How Do I Use BrazilianStemmer in Lucene 4?

I'm trying to tokenize and stem a Portuguese sentence using Lucene 4. Based on this thread (How to use a Lucene Analyzer to tokenize a String?), I was able to correctly tokenize a Portuguese sentence. However, no stemming was applied. Thus,…

java lucene nlp information-retrieval stemming

asked Dec 13 '14 at 23:57

user3444287

31
3

3

votes

1 answer

multiple results of one variable when applying tm method "stemCompletion"

I have a corpus containing journal data of 15 observations of 3 variables (ID, title, abstract). Using RStudio, I read in the data from a .csv file (one line per observation). When performing some text mining operations, I ran into trouble when using…

r rstudio tm stemming

asked Oct 05 '14 at 16:23

Dobby

75
5

3

votes

1 answer

Morphology: tool to get the root word and suffix for a given English word

I am trying to do morphological analysis as part of POS tagging. Is there a tool (callable from a Python or Java program) that returns the root form and its suffix when passed an English word as a parameter? For example: if I give…

nlp nltk wordnet stemming morphological-analysis

asked Sep 17 '14 at 18:50

vidya sagar Kushwaha

81
1
6

3

votes

1 answer

Hunspell affix condition regex format. Any way to match the start?

Good day. I'm trying to use Hunspell as a stemmer in my application. I don't quite like Porter and Snowball stemming because of their "chopped" word results like "abus", "exampl". Lemmatizing seems like a good alternative, but I don't know any good…

nlp stemming hunspell

asked Sep 02 '14 at 00:27

SimpleV

396
4
14

3

votes

1 answer

Turn stemming off in Lucene

I need to turn off the stemming of the EnglishAnalyzer or other similar analyzers (such as the ItalianAnalyzer, etc.). I'm using Lucene 3.6.2 and I saw that it is only possible to specify a set of words that should not be stemmed using this…

java lucene stemming

asked May 02 '14 at 23:26

Luca Mastrostefano

3,201
2
27
34
