I've got a word in the text (e.g. `nagymező`) and I want to be able to type either `nagymező` or `nagymezo` in the search query and have the text containing that word show up in the search results.
How can this be accomplished?

We Are All Monica

Sergey Bakumenko
2 Answers
0
You want to use a Unicode folding strategy, probably the `asciifolding` filter. I'm not sure which version of Elasticsearch you're on, so here are a couple of documentation links:
- `asciifolding` for ES 2.x (older version, but much more detailed guide)
- `asciifolding` for ES 6.3
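As a rough illustration of what such a setup might look like, here is a minimal sketch that builds the analysis settings for an index as a Python dict and prints the JSON body. The names `folding_filter` and `folding_analyzer` are made up for this example, and `preserve_original` is switched on only because the comments below suggest it. The field mapping that points a text field at this analyzer is left out, since its syntax differs between Elasticsearch versions.

```python
import json

# Analysis settings for a new index.
# "folding_filter" and "folding_analyzer" are hypothetical names chosen for this sketch.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "folding_filter": {
                    "type": "asciifolding",
                    # Keep the accented token next to the folded one (off by default).
                    "preserve_original": True,
                }
            },
            "analyzer": {
                "folding_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Lowercase first, then fold diacritics.
                    "filter": ["lowercase", "folding_filter"],
                }
            },
        }
    }
}

# Print the JSON body that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```

The field you search on would then need to reference `folding_analyzer` in its mapping, so that the same folding is applied when indexing and when searching.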

We Are All Monica
- Also, don't forget to turn on the `preserve_original` option in the `asciifolding` filter: it is disabled by default, and your search queries will contain words both with and without diacritics. – Alexey Prudnikov Aug 23 '18 at 08:21
- For any solution it is worth testing all four possible cases: – We Are All Monica Aug 27 '18 at 01:59
- (1) search for `esta`, DB contains `esta` – We Are All Monica Aug 27 '18 at 02:00
- (2) search for `está`, DB contains `esta` – We Are All Monica Aug 27 '18 at 02:00
- (3) search for `esta`, DB contains `está` – We Are All Monica Aug 27 '18 at 02:00
- (4) search for `está`, DB contains `está` – We Are All Monica Aug 27 '18 at 02:00
-2
The trick is to remove the diacritics from the text when you index it, so they don't bother you anymore.
Have a look at "ignore accents in elastic search with haystack" and also at https://www.elastic.co/guide/en/elasticsearch/guide/current/custom-analyzers.html (search for 'diacritic' on the page).
Also, because it will probably be useful to someone some day: the regular expression `\p{L}` matches any Unicode letter :D
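To make the idea concrete outside Elasticsearch, here is a minimal Python sketch of diacritic removal. It is only a rough stand-in for what a folding filter does, not the filter itself: it decomposes characters with `unicodedata` and drops the combining marks, so letters with no decomposition (such as `ø`) are left untouched. The helper names `fold` and `matches` are made up for this example. It also runs the four search/DB combinations listed in the comments on the other answer.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and drop combining marks (rough stand-in for folding)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, indexed: str) -> bool:
    """Compare both sides after folding, as if folded at index and search time."""
    return fold(query) == fold(indexed)


# The four (search term, stored term) combinations worth testing.
cases = [
    ("esta", "esta"),
    ("está", "esta"),
    ("esta", "está"),
    ("está", "está"),
]

for query, indexed in cases:
    print(f"search {query!r}, DB {indexed!r}: {matches(query, indexed)}")

print(fold("nagymező"))
```

Note that Python's built-in `re` module does not understand `\p{L}`; the third-party `regex` package does.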
Hope this helps,

Romain Prévost