TLDR: This use case stores name/value pairs inside a nested object for faceted filters and aggregations.

1- Why is it better than a simple field:value at the root of the document, without a nested field?

2- Can you give me a use case that a simple field:value can't handle?

3- Why is it so complicated to do a faceted aggregation in this Stack Overflow question? What am I missing?

Remark: I know that putting name/value pairs inside a nested field keeps the number of mapped fields very small (the default limit in Elasticsearch is 1000), but that is not a concern in my case because I know I will only have a limited number of fields. Is there any other benefit than that?
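
For reference, the nested name/value layout I am comparing against could look roughly like the sketch below. The index name car_nested and the field names facets, name and value are assumptions made up for illustration; they are not taken from the linked use case.

# hypothetical index and field names, for illustration only
PUT /car_nested
{
  "mappings": {
    "properties": {
      "name": { "type": "keyword" },
      "facets": {
        "type": "nested",
        "properties": {
          "name": { "type": "keyword" },
          "value": { "type": "keyword" }
        }
      }
    }
  }
}

# the same car as car_name_1 in the seed data, expressed as name/value pairs
POST /car_nested/_doc
{
  "name": "car_name_1",
  "facets": [
    { "name": "brand", "value": "Toyota" },
    { "name": "property", "value": "luxury" }
  ]
}

With this layout the mapping only ever contains facets.name and facets.value, however many distinct facet names the documents carry.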

Details

Let's try to run a faceted aggregation on an index that only has field:value pairs at the root. This is the car index:

PUT /car
{
  "settings": {
    "analysis": {
      "normalizer": {
        "govz_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "simple_string": {
          "mapping": {
            "type": "keyword",
            "normalizer": "govz_normalizer"
          },
          "path_match": "simple_string_*"
        }
      },
      {
        "simple_number": {
          "mapping": {
            "type": "double"
          },
          "path_match": "simple_number_*"
        }
      }
    ]
  }
}

This is the seed data:

POST car/_doc
{"name": "car_name_1", "simple_string_brand": "Toyota", "simple_string_property": "luxury"}
POST car/_doc
{"name": "car_name_2", "simple_string_brand": "Ford", "simple_string_property": "luxury"}
POST car/_doc
{"name": "car_name_3", "simple_string_brand": "Toyota", "simple_string_property": "sportive"}
POST car/_doc
{"name": "car_name_4", "simple_string_brand": "Honda", "simple_string_property": "city"}
POST car/_doc
{"name": "car_name_5", "simple_string_brand": "Toyota", "simple_string_property": "luxury"}

We can see that faceted aggregations work perfectly with this flat mapping:

GET car/_search
{
    "query": {
        "bool": {
            "should": [
                {
                    "match_all": {}
                }
            ],
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "simple_string_property": "luxury"
                            }
                        },
                        {
                            "term": {
                                "simple_string_brand": "toyota"
                            }
                        }
                    ]
                }
            }
        }
    },
    "aggs": {
        "brand": {
            "terms": {
                "field": "simple_string_brand"
            }
        },
        "property": {
            "terms": {
                "field": "simple_string_property"
            }
        }
    }
}

Result:

{
    "took": 13,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "car",
                "_type": "_doc",
                "_id": "75",
                "_score": 1.0,
                "_source": {
                    "name": "car_name_1",
                    "simple_string_brand": "Toyota",
                    "simple_string_property": "luxury"
                }
            },
            {
                "_index": "car",
                "_type": "_doc",
                "_id": "33278",
                "_score": 1.0,
                "_source": {
                    "name": "car_name_5",
                    "simple_string_brand": "Toyota",
                    "simple_string_property": "luxury"
                }
            }
        ]
    },
    "aggregations": {
        "property": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "luxury",
                    "doc_count": 2
                }
            ]
        },
        "brand": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "toyota",
                    "doc_count": 2
                }
            ]
        }
    }
}
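
For comparison, here is a sketch of what the equivalent facet aggregation might look like on the nested name/value layout. It assumes a hypothetical index car_nested whose facets field is mapped as nested, with keyword sub-fields name and value; those names are my own and are not taken from the linked use case.

# hypothetical index and field names, for illustration only
GET /car_nested/_search
{
  "size": 0,
  "aggs": {
    "facets": {
      "nested": { "path": "facets" },
      "aggs": {
        "facet_names": {
          "terms": { "field": "facets.name" },
          "aggs": {
            "facet_values": {
              "terms": { "field": "facets.value" }
            }
          }
        }
      }
    }
  }
}

Here a single request returns every facet name together with the counts of its values, whereas the flat version above has to list simple_string_brand and simple_string_property explicitly. The price is the extra nested wrapper around every query and aggregation that touches the facets.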
misterone
  • What's your question? Everything you asked so far is opinion-based. – Joe - GMapsBook.com Apr 07 '20 at 08:22
  • Sorry, I thought that was clear. Do you see a use case that field:value can't handle? Or, put another way: when should I use field:value vs nested name/value? I am assuming the people who built this use case did it for a reason. I want to know what I am missing. – misterone Apr 07 '20 at 16:32