I want to match documents satisfying all the conditions below:

  1. author == "tom"
  2. status != "deleted"
  3. at least two of f1-f4 fields match given values

(all fields are keyword)

{"size":24,
"query":{
  "bool":{
    "filter":[{"term":{"author":{"value":"tom","boost":1.0}}}],
    "must_not":[{"term":{"status":{"value":"deleted","boost":1.0}}}],
    "should":[
      {"term":{"f1":{"value":"v1","boost":1.0}}},
      {"term":{"f2":{"value":"v2","boost":1.0}}},
      {"term":{"f3":{"value":"v3","boost":1.0}}},
      {"term":{"f4":{"value":"v4","boost":1.0}}}
    ],
    "minimum_should_match":"2",
    "boost":1.0
  }}
}

UPDATE & SUMMARY

The query I posted above is in fact correct, but my ES provider had installed a buggy custom plugin performing "query optimization", which caused every "minimum_should_match" to be ignored. If you encounter the same problem and can't find any clue, check whether you have any suspicious plugins installed.

ProtossShuttle
  • Everything in the update looks fine; I can't seem to replicate your test case. Any chance you could add `explain: true` to your query and see if there's anything suspicious there? Maybe the values are indexed improperly due to the move from the string type? – Tom Slabbaert May 06 '19 at 11:01

1 Answer

Your query is correct; you just need to remove the "adjust_pure_negative" flag or set it to false.

In short, when the flag is set to true, Elasticsearch will "ignore" all your queries and filter using only the must_not clauses (source).

Also, you can remove the "boost":1.0 entries: 1 is the default value, which makes them redundant.
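To illustrate the point about the redundant boost, here is a minimal sketch that builds the same bool query without any `boost` entries. `buildQuery` is a hypothetical helper written for this example, not part of the Elasticsearch client, and it only addresses the boost; it sets no other flags.

```javascript
// Hypothetical helper (not part of any Elasticsearch client): builds the
// bool query from the question without the redundant "boost": 1.0 entries.
function buildQuery(author, fieldValues, minMatch) {
  // One term clause per optional field. The short form {"term": {field: value}}
  // is equivalent to {"term": {field: {"value": value}}}.
  const should = Object.entries(fieldValues).map(([field, value]) => ({
    term: { [field]: value },
  }));
  return {
    size: 24,
    query: {
      bool: {
        filter: [{ term: { author } }],
        must_not: [{ term: { status: "deleted" } }],
        should,
        minimum_should_match: minMatch,
      },
    },
  };
}

const trimmedBody = buildQuery("tom", { f1: "v1", f2: "v2", f3: "v3", f4: "v4" }, 2);
console.log(JSON.stringify(trimmedBody, null, 2));
```

The resulting object can be passed as the `body` of a search request in the same way as the longer query.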

EDIT: my test

    await client.index({index: 'test', id: 5, type: 'test', body: {author: "george", status: "deleted", f1: "v1", f2: "v2"}});
    await client.index({index: 'test', id: 6, type: 'test', body: {author: "george", status: "x", f1: "v1",}});
    await client.index({index: 'test', id: 7, type: 'test', body: {author: "george", status: "u", f1: "v1", f2: "v2"}});
    await client.index({index: 'test', id: 8, type: 'test', body: {author: "george", status: "q", f1: "v1", f4: "v4"}});
    await client.index({index: 'test', id: 9, type: 'test', body: {author: "george", status: "1", f3: "v3"}});
    let x = await client.search({
        index: 'test',
        body:
            {"size":24,
                "query":{
                    "bool":{
                        "filter":[{"term":{"author":{"value":"george","boost":1.0}}}],
                        "must_not":[{"term":{"status":{"value":"deleted","boost":1.0}}}],
                        "must":[{
                            "bool":{
                                "should":[
                                    {"term":{"f1":{"value":"v1","boost":1.0}}},
                                    {"term":{"f2":{"value":"v2","boost":1.0}}},
                                    {"term":{"f3":{"value":"v3","boost":1.0}}},
                                    {"term":{"f4":{"value":"v4","boost":1.0}}}],
                                "minimum_should_match":"2",
                                "adjust_pure_negative":false,
                                "boost":1.0}}
                        ],
                        "adjust_pure_negative":false,
                        "boost":1.0}}},
    });

Results: 2 hits, as expected:

[
  {
    "_index": "test",
    "_type": "test",
    "_id": "7",
    "_score": 0.5753642,
    "_source": {
      "author": "george",
      "status": "u",
      "f1": "v1",
      "f2": "v2"
    }
  },
  {
    "_index": "test",
    "_type": "test",
    "_id": "8",
    "_score": 0.47000366,
    "_source": {
      "author": "george",
      "status": "q",
      "f1": "v1",
      "f4": "v4"
    }
  }
]
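
As a cross-check on which hits to expect, the three conditions can be evaluated locally against the five test documents above. This is only a plain JavaScript simulation of the intended logic (the names `testDocs`, `wantedValues` and `matchesConditions` are made up for this sketch); it does not call Elasticsearch.

```javascript
// Local simulation of the three conditions; no Elasticsearch involved.
const testDocs = [
  { id: 5, author: "george", status: "deleted", f1: "v1", f2: "v2" },
  { id: 6, author: "george", status: "x", f1: "v1" },
  { id: 7, author: "george", status: "u", f1: "v1", f2: "v2" },
  { id: 8, author: "george", status: "q", f1: "v1", f4: "v4" },
  { id: 9, author: "george", status: "1", f3: "v3" },
];
const wantedValues = { f1: "v1", f2: "v2", f3: "v3", f4: "v4" };

function matchesConditions(doc, author, minMatch) {
  // Count how many of f1-f4 hold the wanted value (the "should" clauses).
  const matched = Object.entries(wantedValues)
    .filter(([field, value]) => doc[field] === value).length;
  return (
    doc.author === author &&      // filter
    doc.status !== "deleted" &&   // must_not
    matched >= minMatch           // minimum_should_match
  );
}

const expectedIds = testDocs
  .filter((doc) => matchesConditions(doc, "george", 2))
  .map((doc) => doc.id);
console.log(expectedIds); // [ 7, 8 ]
```

Document 5 is excluded by its status, and documents 6 and 9 match only one of the four fields, which leaves 7 and 8.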
Tom Slabbaert
  • Thank you. `adjust_pure_negative` and `boost` are generated by the `SearchSourceBuilder` from the Java High Level REST Client. I tried setting `adjust_pure_negative` to `false` manually, but the behavior of the two queries remained the same. – ProtossShuttle May 06 '19 at 09:56
  • I posted my testing code; take a look at it and tell me if you see any difference between my query and yours. If you can't, please post the mapping of your index (I'm assuming it won't be standard?) and also an example of a "false positive" document. – Tom Slabbaert May 06 '19 at 10:12
  • I think it's exactly the same. Please check the update. It's really weird that `minimum_should_match` is not working at all, even in the simplest case. – ProtossShuttle May 06 '19 at 10:24
  • I have finally found the true cause. The ES provider from another team in my company installed a custom plugin performing "query optimization", which led to all `minimum_should_match` clauses being ignored. It's a severe bug. – ProtossShuttle May 07 '19 at 07:45