
In Lucene 5, Filter is deprecated in favor of wrapping a normal query object in ConstantScoreQuery. I came across a case where the query I "translated" from an old filter does not behave as I expected.

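import org.apache.lucene.analysis.core.KeywordAnalyzer
import org.apache.lucene.document.{Document, Field, StringField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.queries.{BooleanFilter, TermFilter}
import org.apache.lucene.search._
import org.apache.lucene.store.RAMDirectory

val directory = new RAMDirectory()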
val config = new IndexWriterConfig(new KeywordAnalyzer())
val writer = new IndexWriter(directory, config)
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v2", Field.Store.YES))
  document
})
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v3", Field.Store.YES))
  document
})
writer.commit()

val reader = DirectoryReader.open(directory)
val searcher = new IndexSearcher(reader)

val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v1") ) ), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v2") ) ), BooleanClause.Occur.MUST_NOT)
      .build()
    ,
    BooleanClause.Occur.MUST_NOT
  ).build()

Console.println("filter: " + filter)
val results = searcher.search(filter, Int.MaxValue)
Console.println("# results: " + results.totalHits)

val filter2 = new BooleanFilter()

filter2.
  add({
    val inner = new BooleanFilter()
    inner add(new TermFilter(new Term("k", "v1")), BooleanClause.Occur.MUST)
    inner add(new TermFilter(new Term("k", "v2")), BooleanClause.Occur.MUST_NOT)
    inner
  }, BooleanClause.Occur.MUST_NOT)

Console.println("filter2: " + filter2)
val results2 = searcher.search(new MatchAllDocsQuery(), filter2, Int.MaxValue)
Console.println("# results2: " + results2.totalHits)

The console output is:

filter: -(+ConstantScore(k:v1) -ConstantScore(k:v2))
# results: 0
filter2: BooleanFilter(-BooleanFilter(+k:v1 -k:v2))
# results2: 1

I expected filter and filter2 to behave the same in Lucene 5, but the results clearly differ. What did I do wrong?

Sheng

1 Answer


The answer seems to come from this SO post:

Weird Solr/Lucene behaviors with boolean operators

The relevant part reads:

Boolean queries must have at least one "positive" expression (i.e., MUST or SHOULD) in order to match. Solr tries to help with this, and if asked to execute a BooleanQuery that contains only negated clauses at the topmost level, it adds a match all docs query (i.e., *:*).

If the top level BooleanQuery contains somewhere inside of it a nested BooleanQuery which contains only negated clauses, that nested query will not be modified, and it (by definition) can't match any documents -- if it is required, that means the outer query will not match.
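
To convince myself of the rule quoted above, here is a minimal toy model of the matching semantics in plain Scala, with no Lucene involved. All names in it (BooleanModel, TermQ, BoolQ, MatchAll, count) are made up for illustration, and it deliberately ignores scoring and minimum-should-match. It only encodes the one assumption that a boolean query needs a matching MUST or SHOULD clause before its MUST_NOT clauses can exclude anything.

```scala
object BooleanModel {
  sealed trait Occur
  case object Must extends Occur
  case object Should extends Occur
  case object MustNot extends Occur

  sealed trait Q
  final case class TermQ(value: String) extends Q
  case object MatchAll extends Q
  final case class BoolQ(clauses: List[(Q, Occur)]) extends Q

  // A document is modelled as the set of values stored in field "k".
  type Doc = Set[String]

  def matches(q: Q, doc: Doc): Boolean = q match {
    case TermQ(v) => doc.contains(v)
    case MatchAll => true
    case BoolQ(clauses) =>
      val must    = clauses.collect { case (c, Must) => c }
      val should  = clauses.collect { case (c, Should) => c }
      val mustNot = clauses.collect { case (c, MustNot) => c }
      // With no MUST and no SHOULD clause, `positive` is false:
      // a purely negative query matches nothing.
      val positive =
        if (must.nonEmpty) must.forall(matches(_, doc))
        else should.exists(matches(_, doc))
      positive && !mustNot.exists(matches(_, doc))
  }

  // The two documents from the question.
  val docs: List[Doc] = List(Set("v1", "v2"), Set("v1", "v3"))

  val inner: Q = BoolQ(List((TermQ("v1"), Must), (TermQ("v2"), MustNot)))

  // Outer query with only a MUST_NOT clause, as in the question.
  val original: Q = BoolQ(List((inner, MustNot)))

  // Same query with an extra match-all SHOULD clause.
  val fixed: Q = BoolQ(List((inner, MustNot), (MatchAll, Should)))

  def count(q: Q): Int = docs.count(matches(q, _))
}
```

In this model, BooleanModel.count(BooleanModel.original) is 0 and BooleanModel.count(BooleanModel.fixed) is 1, which mirrors the two hit counts printed in the question.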

In short, a BooleanQuery needs at least one MUST or SHOULD clause to match anything. My outer query contains only a MUST_NOT clause, so it always returns nothing. I have to add a MatchAllDocsQuery to the outer BooleanQuery.Builder so that there is a positive clause for the MUST_NOT to subtract from. The filter modified as below does the trick:

val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v1") ) ), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v2") ) ), BooleanClause.Occur.MUST_NOT)
      .build()
    ,
    BooleanClause.Occur.MUST_NOT
  ).add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD).build()
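
If this pattern comes up often when porting old filters, the match-all clause could be hidden behind a small helper. The following is only a sketch: NegatedFilter and allExcept are hypothetical names of my own, not part of Lucene, and it assumes a Lucene 5 version that has BooleanQuery.Builder (as used above). It uses MUST rather than SHOULD for the match-all clause, and wraps the result in ConstantScoreQuery because a translated filter should not contribute to scoring.

```scala
import org.apache.lucene.index.Term
import org.apache.lucene.search.{BooleanClause, BooleanQuery, ConstantScoreQuery, MatchAllDocsQuery, Query, TermQuery}

object NegatedFilter {
  // Hypothetical helper (not a Lucene API): matches every document
  // except those matched by q. The MatchAllDocsQuery supplies the
  // positive clause that a purely negative BooleanQuery lacks.
  def allExcept(q: Query): Query =
    new ConstantScoreQuery(
      new BooleanQuery.Builder()
        .add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST)
        .add(q, BooleanClause.Occur.MUST_NOT)
        .build())

  // The inner query from the question: k:v1 but not k:v2.
  val inner: Query = new BooleanQuery.Builder()
    .add(new TermQuery(new Term("k", "v1")), BooleanClause.Occur.MUST)
    .add(new TermQuery(new Term("k", "v2")), BooleanClause.Occur.MUST_NOT)
    .build()

  // Intended as a replacement for the hand-built filter above.
  val filter: Query = allExcept(inner)
}
```

NegatedFilter.filter can then be passed to searcher.search(NegatedFilter.filter, Int.MaxValue) in place of the hand-built query.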
Sheng