I'm using Lucene with StandardAnalyzer to build the indexes in my code, but there is a problem with 'Yo' and 'Ye' (Ё and Е).
I want a search for 'yo' to also yield results containing 'ye', and vice versa. I tried to create a new Analyzer class, similar to StandardAnalyzer, with a custom filter, but had no luck. I'm also well aware of RussianAnalyzer, but it doesn't seem to work for me, as it treats 'yo' and 'ye' as distinct letters.
Here is the chunk where I'm using this analyzer:
QueryParser queryParser = new QueryParser("myText", new MyAnalyzer());
queryParser.setDefaultOperator(QueryParser.Operator.AND);
After this I call queryParser.parse() and do the rest of the query building for the search.
The question is: what is the right way to do this? Should I use a custom TokenFilter? Or maybe my own CharFilter?
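To make the goal concrete, the folding I'm after amounts to roughly the following. This is only a plain-Java sketch with no Lucene involved; the class name YoFolding and the method foldYo are made up for illustration and are not part of any library. Presumably whatever filter ends up doing this has to run at both index time and query time, so that both spellings reduce to the same term.

```java
final class YoFolding {
    // Hypothetical helper: maps ё (U+0451) to е (U+0435) and Ё (U+0401) to Е (U+0415),
    // so that both spellings of a word compare equal after folding.
    static String foldYo(String text) {
        return text.replace('\u0451', '\u0435')
                   .replace('\u0401', '\u0415');
    }

    public static void main(String[] args) {
        String indexed = foldYo("\u0451\u043B\u043A\u0430"); // "ёлка"
        String queried = foldYo("\u0435\u043B\u043A\u0430"); // "елка"
        System.out.println(indexed.equals(queried)); // prints true
    }
}
```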
Wikipedia links for the characters in question: https://en.wikipedia.org/wiki/Yo_(Cyrillic) https://en.wikipedia.org/wiki/Ye_(Cyrillic)