Highest Voted 'stemming' Questions

0 votes, 1 answer

Should I reindex documents in Elasticsearch when I change the stemmer?

I am using Elasticsearch to index my documents (although I believe my question applies to any other search engine, such as Lucene or Solr, as well). I am using the Porter stemmer and a list of stop words at index time. I know that I should apply the…

asked Oct 08 '14 at 16:09

Soheil


0 votes, 1 answer

Most memory-efficient way to combine word stemming and the elimination of stop words in Perl?

I've patched together a Perl script intended to take each word from a batch of documents, eliminate all stop words, stem the remaining words, and create a hash containing each stemmed word and its frequency of occurrence. However, after working…

performance perl memory stemming stop-words

asked Sep 17 '14 at 19:28

Rick


0 votes, 1 answer

Stem comparison algorithm

I'm writing a program that declines words in Polish. In this language, stems can vary in some cases (because of palatalization, the mobile/fleeting e, and other effects). For example, we have the word "karzeł" and it is basic dictionary…

algorithm nlp stemming

asked Sep 09 '14 at 08:22

Harry


0 votes, 1 answer

Is there an option to toggle stemming in Solr?

I would like to have an option to turn stemming on and off in my searches using a toggle. How can I do that? Thanks, N

solr stemming

asked Jul 31 '14 at 13:14

user1748101


0 votes, 1 answer

Stemming words in NLP

Can anyone tell me which stemmer is best? Also, I have a text and I only want to stem the words that are in a list and leave the rest of the tokens as they are. Below is my code. Text: swot del swot analys 2013 strengths weak brand nam valu at $ 7 .',…

python stemming

asked Jul 24 '14 at 17:49

Raghav Shaligram


0 votes, 2 answers

Implementing KStemmer

First, I thank anyone who takes the time to help. The internet community is so essential for learning. Overall goal: I am inputting a .txt file, stemming it using a Java build of the 2003 CIIR KStemmer in Eclipse, and outputting a list of stemmed…

java eclipse apache stemming

asked Jul 21 '14 at 23:53

user3862565


0 votes, 0 answers

Stemming and lemmatization in Python

I have checked all the other threads and tried a few of the solutions. I am having trouble using the Porter stemmer. I am trying to remove the affixes; however, the Porter stemmer reduces words to some odd forms: for example, "languages" becomes "languag" and…

python stemming textblob

asked Jul 19 '14 at 07:17

Raghav Shaligram


0 votes, 0 answers

Stemming csv files in Python

Okay, I have some Python code that imports two CSV files. The first CSV file is named "claims" (one column, many rows) and the other one is named "sexualHarassment" (one column, many rows). The program right now checks all rows of "claims"…

python csv stop-words stemming

asked Jul 11 '14 at 19:19

Abtra16


0 votes, 0 answers

How to stem Indonesian text using Lucene

I have tried to use the IndonesianAnalyzer class from the org.apache.lucene.analysis.id.* package. Now I have a problem: how do I initialize the IndonesianAnalyzer class in my project? My code looks like this: import java.io.*; import…

java lucene stemming

asked Jun 04 '14 at 18:06

user3708259


0 votes, 1 answer

Mapping of words to stemmed words (Stem dictionary)

I want to generate a mapping of (word, stemmed word) pairs, which I'll need for my project. I am trying to generate the mapping this way: 1. I took a text (in file 1), used RapidMiner to stem all the words, and saved the resulting text in another file, say…

java nlp rapidminer stemming

asked Jun 04 '14 at 06:10

user3290349


0 votes, 1 answer

Lucene project fatal error

I have a lot of text messages, and I run the lines of code below on them. // tokenize term TokenStream tokenStream = new ClassicTokenizer(LUCENE_VERSION, new StringReader(term)); // stemmize tokenStream = new PorterStemFilter(tokenStream); SOMETIMES I…

java lucene stemming porter-stemmer

asked May 02 '14 at 11:00

user3582044


0 votes, 1 answer

Error message while stemming for sentiment analysis

I am stemming my dataset for sentiment analysis and I got this error message: "Error in structure(if (length(n)) n else NA, names = x) : 'names' attribute [2] must be the same length as the vector [1]" Please…

r sentiment-analysis stemming

asked Apr 16 '14 at 22:59

user3456230


0 votes, 1 answer

StanfordCoreNLP does not work the way I expect

I use the code below. However, the outcome is not what I expected. The outcome is [machine, Learning], but I want to get [machine, learn]. How can I do this? Also, when my input is "biggest bigger", I want to get a result like [big, big], but the outcome…

java nlp stanford-nlp stemming lemmatization

asked Apr 15 '14 at 14:39

CSnerd


0 votes, 2 answers

A simple stemming algorithm with String for input

I've been looking at word stemming algorithms such as the Porter algorithm, but everything I've found so far has dealt with files as input. Are there any existing algorithms which would let me simply pass the stemmer a string, and have it return the…

java algorithm stemming porter-stemmer

asked Mar 25 '14 at 14:23

user3163073


0 votes, 1 answer

Stemming demonyms in Solr (Russian => Russia)

Trying to match queries containing "russia" or "russian" to "Russian Federation" using Solr (as well as other country demonyms, such as "american", "syrian" etc). What is a good way to handle this without adding synonyms for each country, and…

solr lucene stemming

asked Mar 19 '14 at 21:25

Neil McGuigan

