Questions tagged [stemming]

The process for reducing inflected words to their stem.

In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form, generally a written word form.
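As a rough illustration of the definition above, here is a minimal suffix-stripping sketch. The suffix list and the `naive_stem` name are assumptions made for this example; established stemmers such as Porter or Snowball apply much more careful, ordered rules.

```python
# Illustrative suffix list only (an assumption for this sketch);
# real stemmers use ordered, condition-guarded rule sets.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    return lowered


for w in ("tests", "tested", "testing", "test"):
    print(w, "->", naive_stem(w))  # each of these prints "... -> test"
```

Note that a stem produced this way need not be a dictionary word; it only has to be the same for related forms.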

531 questions

1 answer

Elasticsearch : singular and plural results

We use the minimal_english stemmer filter in our mapping. This is to ensure that only singular and plural forms are searchable, not other similar words. E.g., Test and Tests should be found when searching for the term Test, but Tester, Testers, Testing should…

elasticsearch stemming

asked Dec 20 '12 at 11:54

Himadri Pant

0 answers

Multi-language stemming in Haystack with ElasticSearch

I'd like to set the stemming language on a per-user basis in Django Haystack with ElasticSearch as the backend. In our Django model, we have image objects that contain a comma-separated tag CharField for each language (English, Spanish, German, ...): tags_en, tags_es,…

django multilingual django-haystack stemming

asked Jun 11 '12 at 08:18

Simon Steinberger

3 answers

Slovenian stemmer for Sphinx

I am looking for a stemming algorithm for the Slovenian language that I can use with Sphinx search. For example, when searching for 'jabolka', I also want results for documents containing 'jabolko', 'jabolki', 'jabolk', etc. I…

php search full-text-search sphinx stemming

asked Jan 03 '12 at 14:50

KoviNET

1 answer

Neither stemmer nor lemmatizer seem to work very well, what should I do?

I am new to text analysis and am trying to create a bag-of-words model (using sklearn's CountVectorizer). I have a data frame with a column of text with words like 'acid', 'acidic', 'acidity', 'wood', 'woodsy', 'woody'. I think that 'acid' and…

python wordnet stemming lemmatization countvectorizer

asked May 16 '22 at 19:59

Rebecca James

1 answer

In Solr, why is 'built' not being stemmed to 'build' but 'building' is?

I'm trying to figure out two things in this posting: (1) Why is 'built' NOT being stemmed to 'build' even though the field type definition has a stemmer defined, while 'building' is being stemmed to 'build'? (2) How to use Luke to examine the index to…

lucene solr stemming porter-stemmer

asked Aug 18 '11 at 01:10

jabawaba

2 answers

Avoid slow highlighting on Solr because of stemming

I am quite new to Solr and would like to ask for your help. I am developing an application which should be able to highlight the results of a query. For this I am using the regex fragmenter:

solr highlighting stemming

asked Jul 29 '11 at 13:33

oroszgy

2 answers

One word phrase search to avoid stemming in Solr

I have stemming enabled in my Solr instance. I had assumed that in order to perform an exact word search without disabling stemming, it would be as simple as putting the word in quotes. This, however, does not appear to be the case. Is there a…

search lucene solr stemming

asked Jun 02 '11 at 13:46

Ruth

1 answer

R function doesn't loop through column but repeats first row result

I am trying to use the stemming function suggested in the corpus package stemming vignette here https://cran.r-project.org/web/packages/corpus/vignettes/stemmer.html but when I try to run the function on the entire column it seems to just be…

r function stemming

asked Nov 06 '19 at 20:22

Kreitzbe87

1 answer

How can I apply stemming into a dictionary?

I'm working on an NLP task. I compare a dataframe of articles with input words. The main goal is to classify a text if a given set of words is found in it. I've tried to extract the values in the dictionary, convert them into a list, and then apply stemming to…

python dictionary nlp stemming

asked Aug 12 '19 at 04:57

Chacho Fuva

2 answers

Python nltk stemmers never remove prefixes

I'm trying to preprocess words to remove common prefixes like "un" and "re"; however, all of nltk's common stemmers seem to completely ignore prefixes: from nltk.stem import PorterStemmer, SnowballStemmer,…

python nlp nltk stemming porter-stemmer

asked Sep 02 '18 at 19:51

jon_simon
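For the prefix question above, one possible workaround is a separate prefix-stripping pass before or after stemming. The sketch below is an assumption-laden illustration, not part of NLTK: the `PREFIXES` tuple, the `strip_prefix` name, and the length guard are all hypothetical choices for this example.

```python
# Hypothetical prefix list taken from the question's examples ("un", "re").
PREFIXES = ("un", "re")


def strip_prefix(word: str, min_remainder: int = 4) -> str:
    """Remove one listed prefix if at least min_remainder characters remain."""
    lowered = word.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix) and len(lowered) - len(prefix) >= min_remainder:
            return lowered[len(prefix):]
    return lowered


print(strip_prefix("unhappy"))  # happy
print(strip_prefix("rewrite"))  # write
print(strip_prefix("read"))     # read (remainder too short, left alone)
print(strip_prefix("really"))   # ally (over-stripping: "re" is not a prefix here)
```

The last call shows why blind prefix removal is risky: without a dictionary or morphological analysis, a string match cannot tell a true prefix from a word that merely starts with the same letters.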

1 answer

German stemmer is not removing feminine suffixes "-in" and "-innen"

In German, every job has a feminine and a masculine version. The feminine one is derived from the masculine one by adding an "-in" suffix. In the plural form, this turns into "-innen". Example: | English |…

python nlp nltk stemming snowball-stemmer

asked Jul 13 '18 at 01:17

sebrockm
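For the German question above, a small post-processing step could remove the "-in" and "-innen" suffixes it describes. This is only a sketch under stated assumptions: `strip_feminine_suffix` and its length guard are hypothetical names and values for this example, not part of NLTK or Snowball.

```python
def strip_feminine_suffix(word: str, min_stem_len: int = 4) -> str:
    """Drop a trailing '-innen' or '-in' if at least min_stem_len characters remain."""
    # Check the longer suffix first so "Lehrerinnen" does not stop at "-in"-less forms.
    for suffix in ("innen", "in"):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


print(strip_feminine_suffix("Lehrerin"))     # Lehrer
print(strip_feminine_suffix("Lehrerinnen"))  # Lehrer
print(strip_feminine_suffix("Wein"))         # Wein (too short to strip)
```

A purely string-based rule like this will also truncate unrelated words that happen to end in "in" (for example "Berlin"), and it does not undo umlauts ("Ärztin" becomes "Ärzt", not "Arzt"), so an exception list or dictionary check would likely be needed in practice.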

0 answers

Full-Text Search and stemming on multilanguage column

I have a table with a column that contains data in different languages, like this:

Id  Text   Language
1   name   en
2   names  en
3   имя    ru
4   nom    fr

I need full-text search for this multilingual column, but FTS is…

sql sql-server full-text-search stemming wordbreaker

asked Nov 06 '17 at 01:25

AtlasPromotion

1 answer

English verbs processing ending with 'e'

I am implementing a few string replacers with these conversions in mind: 'thou sittest' → 'you sit', 'thou walkest' → 'you walk', 'thou liest' → 'you lie', 'thou risest' → 'you rise'. If I keep it naive, it is possible to use a regex for this case to find &…

python nlp stemming text-processing

asked Feb 07 '17 at 12:08

nehem
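The four conversions listed in the question above can be sketched with a regex plus a small exception table. Everything here is an assumption for illustration: the `E_STEMS` lookup, the `modernize` name, and the consonant-undoubling heuristic are hypothetical and cover only the question's examples.

```python
import re

# Hypothetical lookup for stems whose modern form ends in a silent 'e';
# a regex alone cannot tell 'risest' -> 'rise' from 'walkest' -> 'walk'.
E_STEMS = {"li": "lie", "ris": "rise"}

# Matches "thou <verb>est" and captures the verb without its "-est" ending.
PATTERN = re.compile(r"\bthou (\w+)est\b")


def modernize(text: str) -> str:
    """Rewrite 'thou <verb>est' as 'you <verb>' using the heuristics above."""
    def repl(match: re.Match) -> str:
        stem = match.group(1)
        if stem in E_STEMS:
            stem = E_STEMS[stem]
        elif len(stem) > 2 and stem[-1] == stem[-2] and stem[-1] not in "lsf":
            # Undo consonant doubling ('sitt' -> 'sit'); l, s, f are skipped
            # because stems like 'tell' or 'pass' keep their double letter.
            stem = stem[:-1]
        return f"you {stem}"

    return PATTERN.sub(repl, text)


for phrase in ("thou sittest", "thou walkest", "thou liest", "thou risest"):
    print(phrase, "->", modernize(phrase))
```

The exception table is the part that does not scale: deciding whether a stem takes a final 'e' generally needs a word list or a lemmatizer rather than more regex rules.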

3 answers

Word Base/Stem Dictionary

It seems my Google-fu is failing me. Does anyone know of a freely available word base dictionary that just contains bases of words? So, for something like strawberries, it would have strawberry. But does NOT contain abbreviations or misspellings or…

java dictionary nlp stemming

asked Oct 26 '10 at 15:19

AHungerArtist

0 answers

How to use Whoosh to extract unstemmed keywords from a text?

I’m using Whoosh with Haystack. Haystack does not abstract the keyword extraction in Whoosh, so I’m using Whoosh directly for this feature. @property def keywords(self): whoosh_backend = SearchForm().searchqueryset.query.backend if not…

python keyword stemming whoosh

asked Jun 16 '16 at 13:17

Dawn Drescher
