I have a phrase that I want to find in Solr, for example: (Ann OR Annie) is walking her dog. I want it to match Solr documents like:

  1. Ann is walking a dog (changed token)
  2. Ann is walking dog (missing token)
  3. Ann is walking her wonderful dog (additional token).

The first one can be done (more or less) with ComplexPhraseQueryParser by using, for example, (her OR a), but that is not perfect because I might not know all the alternatives. It also works fine for the third type when combined with the proximity operator ~, but it won't work at all for the second type, because one of the tokens is missing.
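For illustration, here is a minimal sketch of the kind of ComplexPhraseQueryParser request described above. The field name `text`, the slop value, and the helper `complex_phrase_params` are assumptions made for this example, not part of my actual setup.

```python
from urllib.parse import urlencode


def complex_phrase_params(field, phrase, slop=0, in_order=True):
    """Build the q parameter for Solr's complexphrase parser (hypothetical helper)."""
    # Local params select the parser; inOrder controls whether term order matters.
    local = "{!complexphrase inOrder=%s}" % ("true" if in_order else "false")
    q = '%s%s:"%s"' % (local, field, phrase)
    if slop > 0:
        # The proximity operator ~N allows up to N extra position moves in the phrase.
        q += "~%d" % slop
    return {"q": q}


# Alternatives for the changed token are listed explicitly; slop=1 leaves room
# for one additional token such as "wonderful". "text" is an assumed field name.
params = complex_phrase_params(
    "text", "(Ann OR Annie) is walking (her OR a) dog", slop=1
)
print(params["q"])
print(urlencode(params))  # URL-encoded form, ready to append to /select?
```

Every position in the phrase still has to be filled by some token, which is why this style of query cannot match the document with the missing token.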

The second and third ones can be achieved with eDisMax by combining minimum match (mm) with ps2 and ps3, but that does not handle the variability needed in Ann OR Annie: the whole query is parsed as OR, so a document that contains both Ann and Annie would score higher than one that contains only one of them (I want to treat them equally). I am also still not sure whether this works well when the searched words (Ann and Annie) are at the same position in Solr (position increment = 0).
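As a rough sketch, the eDisMax variant would use request parameters along these lines. The field name `text`, the values chosen for `mm`, `ps2` and `ps3`, and the helper `edismax_params` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def edismax_params(user_query, field, mm, ps2, ps3):
    """Assemble eDisMax request parameters (hypothetical helper)."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": field,      # field searched for the individual terms
        "mm": mm,         # minimum number of optional clauses that must match
        "pf2": field,     # boost documents where word pairs appear together
        "ps2": str(ps2),  # slop allowed for the pf2 phrases
        "pf3": field,     # boost documents where word triples appear together
        "ps3": str(ps3),  # slop allowed for the pf3 phrases
    }


# Ann and Annie are both ordinary optional terms here, so nothing marks them
# as alternatives for the same slot in the phrase.
params = edismax_params("Ann Annie is walking her dog", "text", mm="4", ps2=1, ps3=1)
query_string = urlencode(params)
print(query_string)
```

Because Ann and Annie are just two more optional clauses in this request, a document that contains both can match more clauses than one that contains only one of them, which is the scoring problem described above.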

The perfect solution would be something like ComplexPhraseQueryParser with minimum match. Is it possible to achieve that with a query alone, or do I have to write my own parser?

Filu
  • Have you taken into account the use of `solr.SynonymGraphFilterFactory` and `solr.StopFilterFactory` in your analyser chain? – freedev May 15 '17 at 08:50
  • As I understand it, `solr.SynonymGraphFilterFactory` uses alternatives that I define myself as synonyms, so it won't help if I haven't defined some of them. Also, `solr.StopFilterFactory` would only remove (ignore) the defined stop words, but I do not know which token is (or might be) missing. I do not see how they can help me with this task. – Filu May 15 '17 at 11:51
