I have a very particular issue with querying a boolean field and a string field that are both nested inside an array field. The index mapping is as follows:

indexes :string_field_1, type: 'string'
indexes :string_field_2, type: 'string'
indexes :boolean_field_1, type: 'boolean'
indexes :array_field_1 do
           indexes :boolean_field_2, type: 'boolean'
           indexes :string_field_3, type: 'string'
end
indexes :array_field_2 do
           indexes :integer_field_1, type: 'integer'
end
indexes :array_field_3 do
           indexes :integer_field_2, type: 'integer'
end

The indexed document also has many other fields that are not nested inside the array field but must be included among the query fields. I tried an approach using filter and bool queries, as follows:

"query":
        {"bool":
                {"must":
                        [
                                {"query_string":
                                        {"query":"text which is being searched",
                                        "fields":[
                                                "string_field_1",
                                                "string_field_2",
                                                "array_field_1.string_field_3"
                                                ],
                                        "fuzziness":"1","analyze_wildcard":true,"auto_generate_phrase_queries":false,"analyzer":"brazilian","default_operator":"AND"}
                                }
                        ],
                        "filter":[
                                {"bool":
                                        {"must":
                                                [
                                                        {"bool":
                                                                {"should":
                                                                        [
                                                                                {"term":{"boolean_field_1":false}},
                                                                                {"terms":{"array_field_2.integer_field_1":[x,z]}},
                                                                                {"term":{"array_field_3.integer_field_2":y}}]}},
                                                        {"bool":
                                                                {"should":
                                                                        [
                                                                                {"term":{"array_field_1.boolean_field_2":true}},
                                                                                {"terms":{"array_field_2.integer_field_1":[x,z]}},
                                                                                {"term":{"array_field_3.integer_field_2":y}}]}}
                                                ]
                                        }
                                }
                        ]
                }
        }

The problem with this query is that it returns a document which, in my opinion, should not be returned. The document in question is shown below:

"_source": {
    "string_field_1": "text 1",
    "string_field_2": "text 2",
    "boolean_field_1": false, 
    "array_field_1": [
        {
            "boolean_field_2": true,
            "string_field_3": "some text which is not being searched"
        },
        {
            "boolean_field_2": true,
            "string_field_3": "some text which is not being searched"
        },
        {
            "boolean_field_2": false,
            "string_field_3": "text which is being searched"
        },
        {
            "boolean_field_2": true,
            "string_field_3": "some text which is not being searched"
        }
    ],
    "array_field_2": [
        {
            "integer_field_1": A
        }
    ],
    "array_field_3": [
        {
            "integer_field_2": B
        }
    ]
}

As you can see, the third item of array_field_1 contains boolean_field_2: false and also the text being searched. But according to my filter clause, only documents in which array_field_1.boolean_field_2 is true should be retrieved, unless array_field_2.integer_field_1 or array_field_3.integer_field_2 matches, which is not the case for this query. It seems Elasticsearch does not take into account that array_field_1[2] is the element whose boolean_field_2 is false. How can I write my query so that this document isn't retrieved?

Thanks in advance, Guilherme

2 Answers

Another approach is to put the array_field_1.string_field_3 query together with the bool query on the boolean field:

"query":{
    "bool":{
        "should":
        [
            {
                "query_string":
                    {
                        "query":"text which is being searched",
                        "fields":
                            [
                                "string_field_1",
                                "string_field_2"
                            ],
                            "fuzziness":"1","analyze_wildcard":true,"auto_generate_phrase_queries":false,"analyzer":"brazilian","default_operator":"AND"
                    }
            },
            {
                "bool":{
                    "must":
                    [
                        {
                            "query_string":
                            {
                                "query":"text which is being searched",
                                "fields":["array_field_1.string_field_3"],
                                "fuzziness":"1","analyze_wildcard":true,"auto_generate_phrase_queries":false,"analyzer":"brazilian","default_operator":"AND"
                            }
                        },
                        {
                            "bool":{
                                "should":
                                [
                                    {"term":{"array_field_1.boolean_field_2":true}},
                                    {"terms":{"array_field_2.integer_field_1":[x,z]}},
                                    {"term":{"array_field_3.integer_field_2":y}}
                                ]
                            }
                        }
                    ]
                }
            }
        ],
        "filter":
        [
            {
                "bool":{
                    "should":
                    [
                        {"term":{"boolean_field_1":false}},
                        {"terms":{"array_field_2.integer_field_1":[x,z]}},
                        {"term":{"array_field_3.integer_field_2":y}}
                    ]
                }
            }
        ]
    }
}

Unfortunately, this query also retrieves the document. I really do not know how to build this query properly.

The query above is organized as: (X) OR (A AND (B OR C OR D))
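
A rough sketch of why the document still matches, assuming array_field_1 is mapped as a plain object array rather than as a nested type. In that case Elasticsearch flattens the objects, so each sub-field becomes one multi-valued field per document and the link between the values of a single element is lost. The helper `flatten_objects` is hypothetical and only mimics that behaviour, and the text clause is simplified to an exact string comparison:

```ruby
# Sample array_field_1 from the question (only the relevant part of the document).
ARRAY_FIELD_1 = [
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" },
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" },
  { boolean_field_2: false, string_field_3: "text which is being searched" },
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" }
].freeze

# Hypothetical helper: mimics how a non-nested object array is indexed,
# collapsing each sub-field into one multi-valued field per document.
def flatten_objects(items)
  items.each_with_object(Hash.new { |h, k| h[k] = [] }) do |item, flat|
    item.each { |key, value| flat[key] << value }
  end
end

FLAT = flatten_objects(ARRAY_FIELD_1)

# A: the text clause, simplified here to an exact string comparison.
TEXT_MATCHES = FLAT[:string_field_3].include?("text which is being searched")
# B: the term clause on the boolean sub-field.
FLAG_MATCHES = FLAT[:boolean_field_2].include?(true)

# Both clauses are satisfied, but by different array elements.
puts "A AND B on flattened fields: #{TEXT_MATCHES && FLAG_MATCHES}" # prints "... true"
```

So A AND B holds for the document as a whole even though no single element of array_field_1 satisfies both conditions.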


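This was my solution. It uses nested queries on array_field_1, so that the text match and the boolean check are evaluated against the same array element: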

"query":{
    "bool":{
        "should":
        [
            {
                "query_string":
                    {
                        "query":"text which is being searched",
                        "fields":
                            [
                                "string_field_1",
                                "string_field_2"
                            ],
                            "fuzziness":"1","analyze_wildcard":true,"auto_generate_phrase_queries":false,"analyzer":"brazilian","default_operator":"AND"
                    }
            },
            {
                 bool: {
                                   should:[
                                       {
                                           query:{
                                               nested: {
                                                   path: 'array_field_1',
                                                   query: {
                                                       bool: {
                                                           must: [
                                                               { match: { "array_field_1.string_field_3": "text which is being searched"} },
                                                               {term: {"array_field_1.boolean_field_2": true}}
                                                           ]
                                                       }
                                                  }
                                              }
                                          }
                                       },
                                       {
                                          bool:
                                          {
                                            must: [
                                             {
                                                     query:{
                                                         nested: {
                                                             path: 'array_field_1',
                                                             query: {
                                                                 bool: {
                                                                     must: [
                                                                         { match: { "array_field_1.string_field_3": "text which is being searched"} },
                                                                         {term: {"array_field_1.boolean_field_2": false}}
                                                                     ]
                                                                 }
                                                             }
                                                         }
                                                     }
                                                },
                                                {
                                                  query: {
                                                    bool: {
                                                            should: [
                                                              {"terms":{"array_field_2.integer_field_1":[x,z]}},
                                                              {"term":{"array_field_3.integer_field_2":y}}
                                                            ]
                                                        }
                                                      }
                                                }
                                              ]
                                          }
                                       }
                                   ]
                               }
        }
    ]
    }
}
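
A minimal sketch of the logic this query is meant to express, evaluated per array element the way a nested query does. The helpers `nested_match?` and `solution_match?` are hypothetical, the text clause is simplified to an exact string comparison, `ids_match` stands in for the should clause on array_field_2 / array_field_3, and the top-level query_string on string_field_1 / string_field_2 is left out:

```ruby
# Sample array_field_1 from the question (only the relevant part of the document).
ITEMS = [
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" },
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" },
  { boolean_field_2: false, string_field_3: "text which is being searched" },
  { boolean_field_2: true,  string_field_3: "some text which is not being searched" }
].freeze

SEARCH_TEXT = "text which is being searched"

# Hypothetical helper: true only when a single element satisfies both
# conditions, which is the per-element behaviour of a nested query.
def nested_match?(items, text, flag)
  items.any? do |item|
    item[:string_field_3] == text && item[:boolean_field_2] == flag
  end
end

# (text AND flag == true) OR ((text AND flag == false) AND ids_match)
def solution_match?(items, text, ids_match)
  nested_match?(items, text, true) ||
    (nested_match?(items, text, false) && ids_match)
end

# The integer clauses do not match in the question, so the document is rejected.
puts solution_match?(ITEMS, SEARCH_TEXT, false) # prints "false"
# If one of the integer clauses did match, the document would be returned.
puts solution_match?(ITEMS, SEARCH_TEXT, true)  # prints "true"
```

Note that, as far as I know, a nested query only works when the field is mapped with the nested type, for example `indexes :array_field_1, type: 'nested' do ... end`. The mapping in the question does not show this, so treat it as an assumption about the final setup.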