Questions tagged [stemming]

The process of reducing inflected words to their stem.

In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form, generally a written word form.
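As a rough illustration of the definition above, here is a minimal sketch of a suffix-stripping stemmer. The rule list and the `toy_stem` name are assumptions made up for this example; this is not the Porter or Snowball algorithm, only the general idea of collapsing inflected forms onto one stem.

```python
# Toy suffix-stripping stemmer: an illustrative sketch, not Porter or Snowball.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical rule list, checked in order


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["jumping", "jumped", "jumps", "jump"]
stems = [toy_stem(w) for w in words]
print(stems)  # all four forms collapse to the same stem
```

Note that a stem does not have to be a dictionary word: with these rules `toy_stem("skies")` gives `ski`, which is one reason several questions under this tag ask for lemmatization instead.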

531 questions
12
votes
4 answers

Base word stemming instead of root word stemming in R

Is there any way to get the base word instead of the root word when stemming with NLP in R? Code: > #Loading libraries > library(tm) > library(slam) > > #Vector > Vec=c("happyness happies happys","sky skies") > > #Creating Corpus >…
AVSuresh
12
votes
2 answers

German Stemming for Sentiment Analysis in Python NLTK

I've recently begun working on a sentiment analysis project on German texts and I'm planning on using a stemmer to improve the results. NLTK comes with a German Snowball Stemmer and I've already tried to use it, but I'm unsure about the results.…
Florian
12
votes
1 answer

Python stemming (with pandas dataframe)

I created a dataframe with sentences to be stemmed. I would like to use a Snowball stemmer to obtain higher accuracy with my classification algorithm. How can I achieve this? import pandas as pd from nltk.stem.snowball import SnowballStemmer # Use…
Chiel
12
votes
3 answers

Stemming with R Text Analysis

I am doing a lot of analysis with the tm package. One of my biggest problems is related to stemming and stemming-like transformations. Let's say I have several accounting-related terms (I am aware of the spelling issues). After stemming we…
RUser
12
votes
4 answers

R stemming a string/document/corpus

I'm trying to do some stemming in R but it only seems to work on individual documents. My end goal is a term document matrix that shows the frequency of each term in the document. Here's an…
screechOwl
11
votes
1 answer

Is there a good stemmer for Hebrew?

I am looking for a good stemmer for Hebrew - I found nothing at all using Google... On the HebMorph site it says that: Stem and Lemma originally have different meanings, but for Semitic languages they seem to be used interchangeably. Does that mean…
Cheshie
11
votes
1 answer

Effects of Stemming on the term frequency?

How are term frequency (TF) and inverse document frequency (IDF) affected by stop-word removal and stemming? Thanks!
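One way to see the effect is to compute both quantities on a tiny corpus before and after preprocessing. The sketch below is only an illustration under stated assumptions: `toy_stem` is a hypothetical stand-in for a real stemmer, `STOP_WORDS` is a made-up stop list, TF is length-normalised, and IDF uses the plain `log(N / df)` form (real libraries often smooth it).

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a"}  # hypothetical stop list


def toy_stem(word):
    # Stand-in for a real stemmer: strips one trailing "s" from longer words.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def tf_and_idf(docs, term):
    """Return (length-normalised TF per document, IDF = log(N / df))."""
    tfs = [Counter(doc)[term] / len(doc) for doc in docs]
    df = sum(1 for tf in tfs if tf > 0)
    idf = math.log(len(docs) / df) if df else float("inf")
    return tfs, idf


raw = [d.split() for d in ["cats chase the cat", "the dog barks", "cats sleep"]]
# Drop stop words first, then stem what is left.
processed = [[toy_stem(w) for w in doc if w not in STOP_WORDS] for doc in raw]

raw_tfs, raw_idf = tf_and_idf(raw, "cat")
new_tfs, new_idf = tf_and_idf(processed, "cat")
print(raw_tfs, raw_idf)
print(new_tfs, new_idf)
```

In this toy corpus, stemming merges "cats" into "cat", so the term's count and document frequency both rise and its IDF falls. Stop-word removal leaves the count alone but shortens the documents, which raises the length-normalised TF of the remaining terms.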
Ataman
10
votes
5 answers

I want a Java Arabic stemmer

I'm looking for a Java stemmer for Arabic. I found a library called "AraMorph", but its output is uncontrollable and it applies formation to words, which is unwanted. Is there any other stemmer for Arabic?
Kareem Hashem
10
votes
2 answers

Lemmatization with Apache Lucene

I'm developing a text analysis project using Apache Lucene. I need to lemmatize some text (transform the words to their canonical forms). I've already written the code that performs stemming. Using it, I am able to convert the following sentence The…
Kirill Simonov
10
votes
3 answers

Stemming library in Java

Is there any library for stemming in Java?
Maverick
10
votes
4 answers

Is there any stemmer available for Indian languages?

Are there any implementations of stemmers available for Indian languages such as Hindi and Telugu?
rajesh
9
votes
2 answers

How to configure stemming in Solr?

I add "American" to the Solr index. When I search for "America", there are no results. How should schema.xml be configured to get results? Current configuration:
user657009
9
votes
2 answers

Lemmatizing Italian sentences for frequency counting

I would like to lemmatize some Italian text in order to count word frequencies and run further investigations on the lemmatized output. I prefer lemmatizing to stemming because I could extract the word meaning…
TPPZ
9
votes
2 answers

Getting the base form of an English word

I am trying to get the base English word for an English word that has been modified from its base form. This question has been asked here before, but I didn't see a proper answer, so I am putting it this way. I tried two stemmers and one lemmatizer from…
Gunjan
8
votes
4 answers

Stop word removal in JavaScript

Hi, I am looking for a library that will remove stop words from text in JavaScript. My end goal is to calculate tf-idf and then convert the given document into vector space, all in JavaScript. Can anyone point me to a library that'll help…
dhaval2025