I'm trying to use LangDetectLanguageIdentifierUpdateProcessorFactory, which ships with Solr, to detect languages when indexing documents. It looks straightforward to set up; I have added the following to solrconfig.xml:

<updateRequestProcessorChain>
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">title</str>
    <str name="langid.langField">language_s</str>
    <str name="langid.fallback">en</str>
    <bool name="langid.map">true</bool>
    <bool name="langid.map.individual">true</bool>
    <str name="langid.map.individual.fl">title</str>
    <str name="langid.whitelist">en, fr, de, it, ar, ja, zh-cn, zh-tw</str>
    <bool name="langid.map.keepOrig">true</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

But when I start Solr, it reports that it cannot load the class LangDetectLanguageIdentifierUpdateProcessorFactory. I also tried TikaLanguageIdentifierUpdateProcessorFactory, with no luck. I am probably missing something. Do I need any additional packages, libraries, or classes to get multi-language detection working in Solr?

femtoRgon
rusho1234
1 Answer

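Make sure that apache-solr-langid-X.X.jar and the dependent JARs in contrib/langid/lib are available to Solr. The language identification processors live in a contrib module, so they are not on the core classpath by default.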
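
One way to make those JARs visible is with <lib> directives near the top of solrconfig.xml. The following is a minimal sketch, not a drop-in fix: the relative paths are assumptions that depend on where your core's instance directory sits relative to the Solr distribution, and the JAR name pattern assumes the apache-solr-langid-X.X.jar naming mentioned above. Adjust both to match your installation.

```xml
<!-- Assumed layout: paths are resolved relative to the core's instance directory.
     Change the "../../" prefix to wherever your Solr distribution actually lives. -->

<!-- Dependent JARs of the langid contrib module -->
<lib dir="../../contrib/langid/lib/" regex=".*\.jar" />

<!-- The langid contrib JAR itself (assumed to be in the dist directory) -->
<lib dir="../../dist/" regex="apache-solr-langid-\d.*\.jar" />
```

After restarting Solr, the startup log should list the JARs picked up by each <lib> directive, which is a quick way to check whether the paths resolve.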

Jayendra