I'm facing a problem where I have two documents, each containing an array of objects. I would like to find the one document that has a nested object matching two properties at the same time (both in the same object), but I always get both documents.

I created the documents with:

POST /respondereval/_doc
{
  "resp_id": "1236",
  "responses": [
     {"key": "meta","text":"abc"},
     {"key": "property 1", "text": "yes"},
     {"key": "property 2", "text": "yes"}
  ]
}

POST /respondereval/_doc
{
  "resp_id": "1237",
  "responses": [
     {"key": "meta","text":"abc"},
     {"key": "property 1", "text": "no"},
     {"key": "property 2", "text": "yes"}
  ]
}

I defined a mapping for the index to prevent ES from flattening the objects, like this:

PUT /respondereval
{
  "mappings" : {
    "properties": {
      "responses" : {
        "type": "nested"
      }
    }
  }
}

I would now like to find the first document (resp_id 1236) with the following query:

GET /respondereval/_search
{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match": { "responses.key": "property 1" } },
            { "match": { "responses.text": "yes" } }
          ]
        }
      }
    }
  }
}

This should return only the one document that has an element matching both conditions at the same time.

Unfortunately, it always returns both documents. I assume this is because, at some point, ES still flattens the values of the nested object arrays into something like this (simplified):

resp_id 1236: "key":["meta", "property 1", "property 2"], "text":["abc", "yes", "yes"]
resp_id 1237: "key":["meta", "property 1", "property 2"], "text":["abc", "no", "yes"]

Both of these contain property 1 and yes.

What is the correct way to solve this, so that only documents are returned that contain an element in the objects array matching both conditions ("key": "property 1" AND "text": "yes") at the same time?

Alex

2 Answers

The problem is with your mapping. The `key` and `text` fields inside `responses` are mapped as `text`, which uses the standard analyzer by default.

The standard analyzer splits text into tokens on word boundaries such as whitespace. So

`property 1` will be tokenized as

{
    "tokens": [
        {
            "token": "property",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "1",
            "start_offset": 9,
            "end_offset": 10,
            "type": "<NUM>",
            "position": 1
        }
    ]
}

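`property 2` is tokenized the same way, into `property` and `2`.

Hence both documents are returned.

In the second document, `yes` matches the `text` of the `property 2` object, and `property 1` matches the analyzed token `property` of that same object's `key`, because a `match` query only needs one of its tokens to match by default.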
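
You can check the tokenization yourself with the `_analyze` API. A minimal sketch of a request body for `GET /_analyze`; naming the `standard` analyzer explicitly is an assumption here, chosen because it is the default for dynamically mapped `text` fields:

```json
{
  "analyzer": "standard",
  "text": "property 1"
}
```

The response should list the same two tokens shown above, `property` and `1`.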

To make it work, use the `keyword` sub-field, which is not analyzed:

{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match": { "responses.key.keyword": "property 1" } },
            { "match": { "responses.text.keyword": "yes" } }
          ]
        }
      }
    }
  }
}
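
Because the `keyword` sub-field holds the exact value, a `term` query is a common alternative to `match` here. A sketch of the same request, assuming the default dynamic mapping created the `.keyword` sub-fields for `key` and `text`:

```json
{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "term": { "responses.key.keyword": "property 1" } },
            { "term": { "responses.text.keyword": "yes" } }
          ]
        }
      }
    }
  }
}
```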

To query `responses.key` without the `keyword` sub-field, use a phrase query instead:

{
  "query": {
    "nested": {
      "path": "responses",
      "query": {
        "bool": {
          "must": [
            { "match_phrase": { "responses.key": "property 1" } }, // phrase query
            { "match": { "responses.text": "yes" } }
          ]
        }
      }
    }
  }
}
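
If you would rather have exact matching on `responses.key` itself, one option is to map the nested fields as `keyword` explicitly. This is only a sketch of a request body for `PUT /respondereval`: mapping `text` as `keyword` as well is my assumption about your data, and since the type of an existing field cannot be changed, the index would have to be deleted and recreated and the documents indexed again:

```json
{
  "mappings": {
    "properties": {
      "responses": {
        "type": "nested",
        "properties": {
          "key": { "type": "keyword" },
          "text": { "type": "keyword" }
        }
      }
    }
  }
}
```

With such a mapping, the original query on `responses.key` and `responses.text` compares whole values instead of analyzed tokens.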
Gibbs
  • Thanks for the response +1. Your `keyword` example works as expected BUT what would be the correct way to resolve the mapping problem so that I can use the `responses.key` without the `keyword`? – Alex Aug 03 '20 at 17:55
  • This makes sense. Thanks a lot. – Alex Aug 03 '20 at 18:01

Have you tried the `must` query directly, without the `nested` wrapper and its `path`?

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "responses.key": "property 1"
          }
        },
        {
          "match": {
            "responses.text": "yes"
          }
        }
      ]
    }
  }
}
Prathap Reddy
  • I thought it's similar to [this answer](https://stackoverflow.com/a/44586170/13907595) as per my knowledge. Isn't it similar to the above question? – Prathap Reddy Aug 03 '20 at 17:52
  • It does not yield any results. @Gibbs answer is correct. – Alex Aug 03 '20 at 17:56