
Suppose that my index has two documents:

  1. "foo bar"
  2. "bar foo"

When I do a regular match query for "bar foo", both documents match correctly, but they get equal relevance scores. However, I want the order of words to be significant during scoring. In other words, I want document 2 ("bar foo") to have a higher score.

So I tried putting my match query inside the must clause of a bool query and added a match_phrase query (with the same query string) as the should clause. This seems to score hits correctly until I search for "bar test foo". In that case the match_phrase query doesn't seem to match, and the hits are returned with equal scores again.
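For reference, the attempt described above can be sketched as a request body built in Python. This is only an illustration: the field name "title" and the helper name are assumptions, since the question does not give the mapping or the exact query.

```python
import json

def match_with_phrase_bonus(field, text):
    """Build a bool query: the match clause must hit, and the
    match_phrase clause only adds score when the exact phrase occurs."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: {"query": text}}}],
                "should": [{"match_phrase": {field: {"query": text}}}],
            }
        }
    }

# "title" is a hypothetical field name used for illustration only.
body = match_with_phrase_bonus("title", "bar foo")
print(json.dumps(body, indent=2))
```

With "bar test foo" as the text, the match_phrase clause needs all three terms next to each other, so it matches neither document and contributes nothing to the score.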

How can I construct my index/query so that it takes word order into account but does not require all searched words to exist in the document?

Can
  • I think the CirrusSearch MediaWiki extension does this. Results can be unexpected: https://www.mediawiki.org/wiki/Thread:Help_talk:CirrusSearch/Impact_of_word_order_in_two-words_search_query – Nemo Apr 28 '15 at 17:02

2 Answers


Have a look at SpanNearQuery. It lets you require that terms appear in a given order, with or without slop (a limit on how far apart the terms may be).

Elasticsearch documentation is here.
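One possible way to apply this to the question, sketched here as an assumption rather than as the answer's exact recipe: add one span_near clause per ordered pair of query terms as optional should clauses, so that a document is rewarded for each pair it contains in query order even when other query words are missing. The field name "title", the whitespace tokenization, and the helper names are hypothetical; real terms must match whatever the field's analyzer produces.

```python
from itertools import combinations

def ordered_pair_clauses(field, text, slop=2):
    """Return one span_near clause for every pair of query terms,
    keeping each pair in the order it appears in the query."""
    # Naive tokenization: assumes the analyzer lowercases and splits on whitespace.
    terms = text.lower().split()
    clauses = []
    for first, second in combinations(terms, 2):
        clauses.append({
            "span_near": {
                "clauses": [
                    {"span_term": {field: first}},
                    {"span_term": {field: second}},
                ],
                "slop": slop,       # how many positions may separate the terms
                "in_order": True,   # "first" must occur before "second"
            }
        })
    return clauses

def order_aware_query(field, text, slop=2):
    """Match on any term, and add score for each in-order pair found."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: {"query": text}}}],
                "should": ordered_pair_clauses(field, text, slop),
            }
        }
    }

body = order_aware_query("title", "bar test foo")
```

For "bar test foo" this produces pairs (bar, test), (bar, foo) and (test, foo). Only the (bar, foo) pair can match the sample documents, and because in_order is true it matches "bar foo" but not "foo bar". The number of clauses grows quadratically with the number of query terms, so this is better suited to short queries.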

mindas

Take a look at PhraseSearch. You should combine your current search with a PhraseSearch, boosting the PhraseSearch a bit higher than regular term matching.

Doc: PhraseSearch
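In Elasticsearch's query DSL, a phrase search corresponds to the match_phrase query, and the boosting suggested here could look roughly like the sketch below. The field name "title", the helper name, and the boost value of 2.0 are assumptions chosen for illustration, not values from the answer.

```python
def boosted_phrase_query(field, text, phrase_boost=2.0):
    """Combine a regular match with a match_phrase whose score is
    multiplied by phrase_boost when the exact phrase is present."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {field: {"query": text}}},
                    {"match_phrase": {field: {"query": text, "boost": phrase_boost}}},
                ]
            }
        }
    }

boosted = boosted_phrase_query("title", "bar foo")
```

Note that the phrase clause still requires every query term to be present, so on its own this does not cover the "bar test foo" case from the question.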

Zouzias