I am trying to search across nested objects in Elasticsearch, but I must be doing something wrong because I get no results.

I run the following commands:

Create Index and Mappings

PUT /demo
{
    "mappings": {
        "person": {
            "properties": {
                "children": {
                    "type": "nested",
                    "properties": {
                        "fullName": {
                            "type": "string"
                        },
                        "gender": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}

Add person document

POST /demo/person/1
{
    "children": [{
        "fullName" : "Bob Smith",
        "gender": "M"
    }]
}

These all execute as expected. However, when I search as outlined in the documentation, I get no results.

Query

POST /demo/person/_search
{
    "query": {
        "bool": {
            "must": [{
                "match_all": {}
            },
            {
                "nested": {
                    "path": "children",
                    "query": {
                        "bool": {
                            "must": [{
                                "match": {
                                    "fullName": "Bob Smith"
                                }
                            }]
                        }
                    }
                }
            }]
        }
    }
}

What am I doing incorrectly?

baynezy

1 Answer

For the record: the issue is that every query and filter must reference a field by its full path. In the example above, the document is indexed as:

{
  "children": [
    {
      "fullName" : "Bob Smith",
      "gender": "M"
    }
  ]
}

To query gender, you must refer to it as children.gender, and to query fullName, you must refer to it as children.fullName.
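
Applied to the search from the question, that means changing only the field name inside the nested query. The following is a sketch of the request body for the same POST /demo/person/_search endpoint, assuming the index and mapping from the question; the match_all clause is left out here because it does not affect which documents match:

```json
{
    "query": {
        "nested": {
            "path": "children",
            "query": {
                "bool": {
                    "must": [{
                        "match": {
                            "children.fullName": "Bob Smith"
                        }
                    }]
                }
            }
        }
    }
}
```

With the document from the question indexed, this should return person 1.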

Lucene effectively flattens all JSON data structures, which is the entire reason the nested type exists. So this document:

{
  "children": [
    {
      "fullName" : "Bob Smith",
      "gender": "M"
    },
    {
      "fullName" : "Jane Smith",
      "gender": "F"
    }
  ]
}

becomes this with object type (the default):

"children.fullName": [ "Bob Smith", "Jane Smith" ]
"children.gender": [ "M", "F" ]

with nested type it becomes:

{
  "children.fullName": [ "Bob Smith" ]
  "children.gender": [ "M" ]
}
{
  "children.fullName": [ "Jane Smith" ]
  "children.gender": [ "F" ]
}

where the {} mark the nested document boundaries (they are not literally there, but logically they are).
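
To see what those boundaries buy you, consider a search for a child named Bob whose gender is F. The request body below is a sketch for POST /demo/person/_search, assuming the two-child document above has been indexed under the nested mapping from the question:

```json
{
    "query": {
        "nested": {
            "path": "children",
            "query": {
                "bool": {
                    "must": [
                        { "match": { "children.fullName": "Bob" } },
                        { "term": { "children.gender": "F" } }
                    ]
                }
            }
        }
    }
}
```

With the nested type this should find nothing, because no single child is both Bob and F. With the default object type, the same two clauses run without the nested wrapper would match the document, since "Bob" and "F" both appear somewhere in the flattened arrays.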

As such, whether you're using nested documents or not, you need to supply the full path to the field name, even if the last part (e.g., gender) is unique to the index.

On a related note: you should never use the nested type when the field only ever holds a single object. It is only useful when the field is actually used as an array of objects. If it is not an array, the flat (object) version serves the exact same function with less overhead. If some documents hold a single object and others hold more than one, then nested still makes sense.
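
For the single-object case, a flat mapping could look like the sketch below. It is the PUT /demo body from the question with the "type": "nested" line removed, so children falls back to the default object type; the field names and settings are carried over from the question as-is:

```json
{
    "mappings": {
        "person": {
            "properties": {
                "children": {
                    "properties": {
                        "fullName": {
                            "type": "string"
                        },
                        "gender": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}
```

Under this mapping the search needs no nested wrapper, but the field is still addressed by its full path, for example a match query on children.fullName.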

pickypg