Solr search for an Arabic site: plural and singular words

Question

I am using Solr to implement search for an Arabic site, and I want to normalize plural words to their singular forms and vice versa, so that searching for "كتاب" returns any document that contains "كتاب" or "كتب". Is this possible in Solr? I would highly appreciate your input.

Probably helpful http://stackoverflow.com/questions/10681281/need-explanation-on-language-stemmer-of-solr — cheffe, Jun 20 '16 at 05:33
There is [a package in Lucene](https://lucene.apache.org/core/6_0_0/analyzers-common/index.html?org/apache/lucene/analysis/ar/ArabicNormalizationFilter.html) (and therefore available in Solr) that is dedicated to the Arabic language. Unfortunately, I do not know enough about Arabic to tell you whether it is what you need. — cheffe, Jun 20 '16 at 05:35

score 1 · Answer 1 · answered Jun 20 '16 at 06:50

There was a presentation at the Lucene/Solr Revolution by Ramzi Alqrainy about Solr's support for Arabic and common issues. It is now available online.

Alexandre Rafalovitch

Thanks, I tried what was presented, but it did not satisfy my requirement. – Moon123 Jun 20 '16 at 08:32

Assem · Answer 2 · 2016-06-26T11:31:04.790

You need a stemmer that reduces words to their roots at both index time and query time. Try Khoja's stemmer or Assem's stemmer.

By default, Solr uses a light stemmer for Arabic, but it seems you need a deep stemmer that extracts the root of each word.
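
For context, the light-stemming chain mentioned above looks roughly like the field type below. This is only a sketch modelled on the `text_ar` definition shipped in Solr's example schemas; the stopword file path is an assumption and may differ in your configset. The light stemmer only strips common prefixes and suffixes, so a broken plural such as "كتب" is not reduced to the same term as "كتاب".

```xml
<!-- Sketch of a light-stemming Arabic field type (based on Solr's example text_ar). -->
<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <!-- A single <analyzer> applies the same chain at index time and query time. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercases any embedded Latin-script terms. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stopword file path is an assumption; adjust it to your configset. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <!-- Normalizes orthographic variants (e.g. alef forms, diacritics). -->
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <!-- Light stemmer: removes common prefixes/suffixes, does not extract roots. -->
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>
```

A root-based stemmer would presumably take the place of the `solr.ArabicStemFilterFactory` step through a custom filter factory, and it would need to sit in the same analyzer so that indexing and searching produce matching terms.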

edited Jun 26 '16 at 11:31

answered Jun 20 '16 at 09:04

Assem

Thanks a lot. I have already tried to create my own custom Arabic stem filter, but I do not know what the correct implementation of the incrementToken() method would be. I would highly appreciate your input. – Moon123 Jun 26 '16 at 11:25
