10

I'm looking for a Java stemmer for Arabic. I found a library called "AraMorph", but its output is hard to control and it applies unwanted formation to words.

Is there any other stemmer for Arabic?

paradigmatic
Kareem Hashem

5 Answers

8

Here is a new Arabic stemmer: Assem's Arabic light stemmer, written using the Snowball framework and generated for many languages, including Java. You can use it by downloading libstemmer for Java here.

Assem
6

You can find the Khoja stemmer here:

http://zeus.cs.pacificu.edu/shereen/research.htm

Direct download:

http://zeus.cs.pacificu.edu/shereen/ArabicStemmerCode.zip

paradigmatic
  • Thank you for your answer, @paradigmatic. I asked my question because I did not know what stemming is. Following your answer, I read about it a little bit. – AlexR Jul 11 '11 at 20:41
  • I want an API or library that I can use in my project. Thanks anyway :) – Kareem Hashem Jul 11 '11 at 21:04
  • @Kareem: It is an API or lib... Check the second link I've posted. – paradigmatic Jul 13 '11 at 07:08
  • The code is under the GPL license though. Quite restrictive for a stemming library. Can't be used in a commercial product. I can't even wrap it in a TokenFilter and contribute it to Lucene, since Apache License and GPL are incompatible. – Basil Musa Feb 10 '16 at 15:44
  • It can be used in a web-based commercial product, SaaS for example. That way no redistributables are involved, so the GPL poses no problem! – Ahmed Rezk Nov 09 '17 at 19:17
3

https://sourceforge.net/projects/arabicstemmer/

Try this; it is based on Shereen Khoja's algorithm.

Faris
1

You can use either the Khoja stemmer or Lucene's light stemmer.

ahmed elbagoury
1

After some digging, I found that the best solution is to implement my own stemmer using the Porter algorithm, so that I can tune it.
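For illustration, a tunable stemmer of this kind could look roughly like the sketch below. It is not the Porter algorithm (which targets English) and not the code of any library mentioned in this thread; it is plain prefix and suffix stripping, in the spirit of a light stemmer. The class name `TunableArabicStemmer`, the method `withDefaults`, the affix lists, and the minimum stem length of 3 are all assumptions made for the example, and the lists are a small illustrative subset rather than a vetted linguistic resource. Arabic letters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: a tunable affix-stripping stemmer sketch, not a library API.
class TunableArabicStemmer {
    private final List<String> prefixes;
    private final List<String> suffixes;
    private final int minStemLength;

    TunableArabicStemmer(List<String> prefixes, List<String> suffixes, int minStemLength) {
        // Longer affixes are tried first, so "wal-" wins over a bare "w-".
        this.prefixes = longestFirst(prefixes);
        this.suffixes = longestFirst(suffixes);
        this.minStemLength = minStemLength;
    }

    private static List<String> longestFirst(List<String> affixes) {
        List<String> sorted = new ArrayList<>(affixes);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        return sorted;
    }

    // Assumed defaults: a small, illustrative set of common affixes.
    static TunableArabicStemmer withDefaults() {
        return new TunableArabicStemmer(
            List.of(
                "\u0648\u0627\u0644", // wal- (and + definite article)
                "\u0628\u0627\u0644", // bal- (with/by + definite article)
                "\u0643\u0627\u0644", // kal- (like + definite article)
                "\u0641\u0627\u0644", // fal- (so + definite article)
                "\u0627\u0644",       // al-  (definite article)
                "\u0644\u0644",       // lil- (for + definite article)
                "\u0648"),            // w-   (and)
            List.of(
                "\u0647\u0627",       // -ha
                "\u0627\u0646",       // -an
                "\u0627\u062A",       // -at
                "\u0648\u0646",       // -un
                "\u064A\u0646",       // -in
                "\u064A\u0629",       // -iyya
                "\u0629",             // teh marbuta
                "\u064A"),            // -i
            3);
    }

    // Strips at most one prefix and one suffix, never going below minStemLength.
    String stem(String word) {
        String result = word;
        for (String p : prefixes) {
            if (result.startsWith(p) && result.length() - p.length() >= minStemLength) {
                result = result.substring(p.length());
                break;
            }
        }
        for (String s : suffixes) {
            if (result.endsWith(s) && result.length() - s.length() >= minStemLength) {
                result = result.substring(0, result.length() - s.length());
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TunableArabicStemmer stemmer = withDefaults();
        // "walmuallimun" (and the teachers): strips "wal-" and "-un".
        System.out.println(stemmer.stem("\u0648\u0627\u0644\u0645\u0639\u0644\u0645\u0648\u0646"));
    }
}
```

With these assumed defaults, الكتاب stems to كتاب and والمعلمون stems to معلم, while the short word ولد is left unchanged because stripping the leading و would leave fewer than three letters. Tuning means passing different affix lists or a different minimum length to the constructor.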

Jim Ferrans
Kareem Hashem
  • 4
    What? It won't work! Arabic is written in non-Latin letters and, more importantly, requires a very different algorithmic approach from Latin-script languages. ... But I'm interested to know whether it worked for you. – Omar Al-Ithawi Aug 13 '12 at 12:30