Highest Voted 'stemming' Questions

0 votes, 1 answer

R text mining: How to perform typical text operations with the tm package on vectors

How can I apply the following standard operations to a character vector? (I need a dictionary for a DTM (classification). To match the text entries, where these operations have already been applied, I have to change my dictionary terms…

asked Feb 23 '14 at 03:37

alex

0 votes, 1 answer

PorterStemmer in Lucene

I am looking for help on how I can use the class PorterStemFilter in Lucene 4.0. Below is my indexer taken from http://www.lucenetutorial.com/lucene-in-5-minutes.html: ... StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40); …

lucene indexing stemming porter-stemmer

asked Feb 21 '14 at 21:32

user2161903

0 votes, 1 answer

Find only exact matches of a particular (exception) word

I am looking for a way to configure Solr so that it finds only exact matches for a particular word and works the normal way for other words. One possible way that comes to mind is to configure the stemmer's synonyms list so that this word is mapped to…

solr stemming

asked Jan 29 '14 at 12:54

axk

0 votes, 0 answers

Get all word forms used in MySQL full-text search

I am using the full-text search feature of MySQL to search through comments. To use stemming, I am using "form of" in the query. This gives me the correct result, returning all comments containing any word form of the search text. However, I need to…

mysql full-text-search stemming

asked Dec 26 '13 at 08:08

Gyanendra Singh

0 votes, 0 answers

R - DocumentTermMatrix control parameters

I am trying to build an SVM model on a text corpus. For this I built a DocumentTermMatrix with the following control parameters: control <- list(stopwords = TRUE, removePunctuation = TRUE, removeNumbers = TRUE, …

r stemming

asked Nov 19 '13 at 00:55

CHEMBETI ARAVIND

0 votes, 1 answer

Stemming + stop word filtering in Lucene 4.0+

I used to use SnowballAnalyzer to combine custom stop word filtering with basic stemming, but it has been deprecated. For example, in the index config, I could easily specify: IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_32, …

lucene stemming

asked Aug 30 '13 at 00:56

abhinavkulkarni

0 votes, 1 answer

Lucene stemmer strategy (does it keep both stemmed & non-stemmed words or just stemmed ones)

I have a question regarding the Lucene stemmer. I was wondering whether Lucene keeps both stemmed and non-stemmed words or just replaces the non-stemmed words with their stemmed forms. For example, if a record has the following: "everyone loves cats" does it…

lucene stemming

asked Jun 20 '13 at 17:23

Mr.Boy

0 votes, 1 answer

SOLR stemming and stopwords

In the Solr 3.5 text field type, the StopFilterFactory is listed before the PorterStemFilterFactory. Does this mean that if I wanted to stop, for example, "game" and "games", I would have to add both to the stopwords? If so, would moving the StopFilterFactory…

solr stemming stop-words

asked Jun 12 '13 at 10:11

dice

0 votes, 0 answers

Code works in VS2010 but not in VC++ 6.0

I'm working on a project in which I'm using a stemming library that works perfectly in Visual Studio 2010 (Express), but when I tried to compile the same project in VC++ 6.0, it generated errors. I fixed a few of them but I'm stuck on some.…

c++ visual-c++ iterator stemming

asked May 25 '13 at 22:29

Sajjad Ahmad

0 votes, 0 answers

Customizing the output of Stemming

I've been using Snowball Porter2 for stemming. I don't want the root form as output. For example, Porter2 produces "Emergenc" after stemming "Emergencies". I want "Emergency" instead. Could someone please point me in the right direction to achieve this? The…

java stemming

asked May 15 '13 at 12:40

nexuscreator

0 votes, 2 answers

lexical-level similarity word clustering tool

Is there any open software toolkit that compares the lexical-level similarities among words and groups similar words together? For example, Blue jean, Blue jeans, and blue jea (misspelled) should be grouped together. I don't need to look for…

machine-learning nlp text-mining stemming

asked Apr 01 '13 at 12:46

walkman

0 votes, 1 answer

Is there a port for KStem for .NET?

I'm about to launch into a Lucene.NET implementation and I am concerned about using the PorterStemFilter. Reading here, and reading source code, it appears to be far, far too aggressive for my needs. I need something simpler that doesn't look for…

lucene.net stemming

asked Mar 15 '13 at 23:02

Kevin

0 votes, 0 answers

Getting the root of an Arabic word

I have Python code that takes an Arabic word, gets its root, and removes diacritics, but I have a problem with the output. For example: when the input is "العربيه" the output is "عرب", but when the input is "كاتب" the output is "ب", and when…

python nlp arabic stemming

asked Mar 01 '13 at 23:16

user2091683

0 votes, 2 answers

Using stemming in a SOLR query

I've set up Solr and added a document to the example 'collection1'. 3007WFP Fishing Ladies I can query it OK in the interface using name:*fishing* but I…

solr stemming

asked Feb 20 '13 at 14:55

finoutlook

0 votes, 1 answer

Searching in Lucene.Net

I have used Lucene.Net for indexing, with StandardAnalyzer at indexing time. Now I want to search for, say, 'attach'. The document contains 'attached'. How do I get a successful hit for the word 'attach'? Please help me as soon as possible.

lucene.net stemming

asked Sep 29 '09 at 06:19

Ashish
