
I'm configuring a Solr core that stores Brazilian Portuguese data.

For accents, I need queries to behave like this:

  search   |   return
computação | computacao
computacao | computação

Basically, a query should return both the accented and the unaccented form of a word, whether or not the query itself contains accents.

I tried:

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<filter class="solr.ASCIIFoldingFilterFactory"/>

Without success.

I'm using Solr 5.2.1

  • The MappingCharFilterFactory with the "mapping-ISOLatin1Accent.txt" file contains the mappings for the characters 'ç' and 'ã' that you give as examples, so these should already work. Are you applying the filter in both the query and the index analyzer in the fieldType config? – spyk Aug 13 '15 at 17:33
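
As a rough illustration of what the comment above asks about, here is a minimal sketch of a field type that folds accents at both index and query time. The field type name text_pt_folded is hypothetical, and the tokenizer and lowercase filter are assumptions, not something the question specifies.

```xml
<!-- schema.xml: hypothetical field type; the name text_pt_folded is illustrative only -->
<fieldType name="text_pt_folded" class="solr.TextField" positionIncrementGap="100">
  <!-- Index-time chain: "computação" is stored as the token "computacao" -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <!-- Query-time chain: the same folding, so both spellings produce the same token -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Documents indexed before such a change would need to be reindexed, because the index-time analyzer only applies to content indexed after it is configured.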

2 Answers


Try adding the BrazilianStemFilterFactory as a filter to the field type you use for searching the field.

It is written specifically for Brazilian Portuguese, so it could solve your issue.

Abhijit Bashetti
  • Be aware that stemming changes tokens into their stems, and will make words match that previously didn't. It might be what you want, but it's not related to actual normalizing of accents - just a side effect of the stemmer. – MatsLindh Aug 09 '15 at 22:26
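
For reference, the stemming approach suggested in this answer might look roughly like the sketch below. The field type name text_br_stemmed is hypothetical, and the surrounding tokenizer and lowercase filter are assumptions; only the stem filter itself comes from the answer.

```xml
<!-- schema.xml: hypothetical field type; the name text_br_stemmed is illustrative only -->
<fieldType name="text_br_stemmed" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer without a type attribute is used for both indexing and querying -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reduces tokens to stems, so different inflections of a word can match each other -->
    <filter class="solr.BrazilianStemFilterFactory"/>
  </analyzer>
</fieldType>
```

As the comment notes, this changes which words match each other, so it is a broader change than accent folding alone.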

When using a multilingual index, what I have done is create a separate field for each language, each using the language-specific text field type.

Say you have English and Portuguese. You would then declare two fields:

  1. descriptionPt, using text_pt
  2. descriptionEn, using text

Now when you run your search, you specify which field to search, or both, via qf, and set defType=edismax.

Worked fine for me.
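
The setup described in this answer could be sketched as follows. The field names and types come from the answer; the indexed/stored attributes, the handler name /search, and the choice to put the parameters in handler defaults rather than on each request are assumptions. The types text_pt and text must already be defined in the schema.

```xml
<!-- schema.xml: one field per language, each with its own field type -->
<field name="descriptionPt" type="text_pt" indexed="true" stored="true"/>
<field name="descriptionEn" type="text" indexed="true" stored="true"/>

<!-- solrconfig.xml: hypothetical handler that searches both fields by default -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Parameter names are case-sensitive: defType, not deftype -->
    <str name="defType">edismax</str>
    <str name="qf">descriptionPt descriptionEn</str>
  </lst>
</requestHandler>
```

The same defType and qf values can instead be passed as request parameters, which makes it easy to restrict a given search to a single language field.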

xmorera