Questions tagged [stemming]

The process of reducing inflected words to their stem.

In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form, generally a written word form.

531 questions
0
votes
2 answers

Search in Solr with hashtag also included

Suppose I search in Solr with the keyword IPL; I want results that include both IPL and #IPL. How can I achieve this? I tried WordDelimFactory, as shown below, in both index and query, but it didn't work out. I think I have to split the string to "string" and…
Babu
0
votes
2 answers

NLTK: Sentiment Analysis and Stemming

I am working on code for sentiment analysis. I would like to use a stemmer in my code snippet, but when I print the results, they show that the stemming does not work. Do you have any idea what I am doing wrong? Here is my code…
Tommy5
0
votes
1 answer

How to prepare feature vectors for text classification when the words in the text are not frequently repeated?

I need to perform text classification on a set of emails. But the words in my text are very sparse, i.e. the frequency of each word across all the documents is very low; words are not repeated very often. Since to train the…
0
votes
1 answer

What should be the outcome of stemming a word with an apostrophe?

I'm using nltk.stem.porter.PorterStemmer in Python to get the stems of words. When I stem "women" and "women's", I get different results: "women" and "women'" respectively. For my purposes I need both words to have the same stem. In my…
Diego Aguado
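
One way to get the same stem for both forms is to strip a trailing possessive marker before the token reaches the stemmer. A minimal sketch, assuming the stemmer is passed in as a plain callable; `strip_possessive` and `stem_normalized` are hypothetical helper names, not part of NLTK:

```python
import re

# Trailing possessive marker: 's or a bare apostrophe (straight or curly).
_POSSESSIVE = re.compile(r"['\u2019]s?$", re.IGNORECASE)


def strip_possessive(token):
    """Return the token without a trailing possessive marker."""
    return _POSSESSIVE.sub("", token)


def stem_normalized(token, stem):
    """Strip the possessive marker, then apply the given stemmer callable."""
    return stem(strip_possessive(token))
```

With NLTK, `stem` could be the `stem` method of a `PorterStemmer` instance. Whether stripping possessives is appropriate depends on the application, since it also discards the distinction between "women" and "women's".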
0
votes
1 answer

How to make snowball greedy between two matches?

I have 2 routines that should be completely parallel. I want Snowball to execute them and choose the one with the longest match. Currently, I run them using or, which means: execute the first and, if it fails, execute the second. I thought of performing a…
Assem
0
votes
1 answer

Custom 'stemming' in Sphinx with Wordforms?

I've found the stem_en and lemmatizer to be either too limiting or too inclusive for my needs. Can I make custom stemming with word forms? Either full words, e.g. Proctology > Proctologist, or ideally stems, e.g. ology > ologist.
user3649739
0
votes
1 answer

Override a stemmed word on the fly in a query with Sphinx?

If I turn on stemming/lemmatizer in Sphinx, can I push a term to it "as needed" that does not utilize stemming? I know I can use wordforms to always exclude that word from stemming, e.g. Radiology > Radiology, but that results in never stemming the…
user3649739
0
votes
1 answer

Can you change priority between wordform and lemmatizer in Sphinx?

If I turn the lemmatizer on, then plurals all work, e.g. Office=Offices, Dog=Dogs. However, if I make a wordform unrelated to plurals, like 100 > Hundred, then Hundred will not match Hundreds (I realize this is not a perfect example, so don't take it literally). So the…
user3649739
0
votes
1 answer

Sphinx Stemming "finicky"?

I just installed the en.pak and (as per a recently posted question) I matched Radiography to Radiograph, Radiographic, and Radiographer. However, I rotated again and now it has somehow stopped working. The only error message I see is "WARNING: index 'idx_X':…
user3649739
0
votes
1 answer

How to activate stemming in my Lucene search code

Can someone please help me activate stemming in my code? I have tried a lot, but without much success :( My current code: Directory createIndex(DataTable table) { var directory = new RAMDirectory(); using (Analyzer analyzer = new…
Harish Mohanan
0
votes
0 answers

Perform a SPARQL query using stemmed words

My question is about how to manage stemmed words in SPARQL queries. For instance: string inserted by user: bbbx; stemmed word: bbb; word to retrieve in my graph: bbby. This could be pretty easy. Second example, both words should be fetched, not…
RobMor
0
votes
2 answers

Finding all words in a paragraph whose first three letters are the same?

How can we solve this problem in the best way? Is there an algorithm for solving it? "In a paragraph, we have to find and print all the words whose first 3 letters are the same. Example: we input some paragraph and as output we get letters…
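
A possible approach is to group words by their first three letters in a single pass over the text. This is only a sketch under one reading of the question (report the groups of distinct words that share a three-letter prefix); `words_sharing_prefix` is a hypothetical name, and the tokenization (ASCII letters only, case-insensitive) is an assumption:

```python
import re
from collections import defaultdict


def words_sharing_prefix(text, n=3):
    """Map each n-letter prefix to the sorted distinct words that start with it.

    Only prefixes shared by at least two distinct words are returned.
    """
    groups = defaultdict(set)
    # Assumption: a "word" is a run of ASCII letters, compared case-insensitively.
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        if len(word) >= n:
            groups[word[:n]].add(word)
    return {prefix: sorted(words) for prefix, words in groups.items() if len(words) > 1}
```

The dictionary lookup makes this linear in the number of words, so no pairwise comparison of words is needed.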
0
votes
1 answer

Solr search for different words with same meaning

I am using Solr with Solarium and I am trying to implement searching for different words with the same meaning. For example, if a user searched for photo, it would also return results for photograph and photographs. I have tried implementing…
kyle
0
votes
0 answers

Use Porter's stemmer from a List?

I have a List of words that I am trying to stem. How can I use the Porter Stemmer with the List? I only found information about stemming using a text file, but I am not sure how to adapt it for the List.
Chechi
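
The question does not name a language or library, so the following is only an illustration in Python of the general idea: a stemmer that works on one word can be mapped over a list. `toy_stem` is a hypothetical stand-in that strips a few suffixes and is not the Porter algorithm; a real Porter implementation would be passed in its place:

```python
def toy_stem(word):
    """Stand-in stemmer for illustration only. NOT the Porter algorithm."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_all(words, stem=toy_stem):
    """Apply a single-word stemmer to every word in a list."""
    return [stem(w) for w in words]
```

For example, `stem_all(["jumping", "jumped", "jumps"])` applies the stemmer to each element in order and returns a new list, leaving the input untouched.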
0
votes
1 answer

How to configure Solr for stemming

I am learning Solr and using solr-5.3.0. I want to include common stemmers in Solr. I followed this tutorial, but after making changes to schema.xml, searching for a term didn't give the desired output. Also, there are many schema.xml files and I am…
user4974500