
One of my fields holds multi-word values. When I run a terms aggregation on it, each word comes back as a separate value.

Example:

name : jess , Region : new york 
name : jess , Region : poland

Request:

query = {
    "size": total,
    "aggs": {
        "buckets_for_name": {
            "terms": {
                "field": "name",
                "size": total
            },
            "aggs": {
                "region_terms": {
                    "terms": {
                        "field": "region",
                        "size": total
                    }
                }
            }
        }
    }
}

With response["aggregations"]["buckets_for_name"]["buckets"] I get:

 {'key': 'jess ', 'doc_count': 61, 'region_terms': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0, 'buckets': [{'key': 'oran', 'doc_count': 60}, {'key': 'new ', 'doc_count': 1}, {'key': 'york', 'doc_count': 1}]}}, {'key': 'jess ', 'doc_count': 50, 'egion_terms': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0, 'buckets': [{'key': 'poland', 'doc_count': 50}]}}

With this code:

pretty_results = []
for result in response["aggregations"]["buckets_for_name"]["buckets"]:
    d = dict()
    d["name"] = result["key"]
    d["region"] = []
    for region in result["region_terms"]["buckets"]:
        d["region"].append(region["key"])
    # Append once per name, after all of its regions have been collected
    pretty_results.append(d)
    print(d)

I get:

{'name': 'jess ', 'region': ['new', 'york', 'poland']}

I want to get this result:

{'name': 'jess ', 'region': ['new york', 'poland']}

1 Answer


The region field (and, I presume, name) is a text field analyzed with the standard analyzer, which splits new york into the tokens [new, york]. A terms aggregation on such a field buckets by token, not by the original string.
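To make the difference concrete, here is a rough approximation in plain Python. It is not how Elasticsearch implements analysis (the real standard analyzer follows Unicode text segmentation rules); the helper names `standard_like_tokens` and `keyword_token` are made up for this sketch.

```python
import re


def standard_like_tokens(value):
    # Crude stand-in for the standard analyzer:
    # lowercase, then split on runs of non-word characters.
    return [tok for tok in re.split(r"\W+", value.lower()) if tok]


def keyword_token(value):
    # A keyword field indexes the whole string as a single term, unchanged.
    return [value]


print(standard_like_tokens("new york"))  # ['new', 'york']
print(keyword_token("new york"))         # ['new york']
```

A terms aggregation creates one bucket per indexed term, which is why the text field yields two buckets and a keyword field yields one.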

What you may want to do is set up a keyword mapping to treat the strings as standalone tokens:

PUT regions
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fielddata": true,
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "region": {
        "type": "text",
        "fielddata": true,
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

After that, index (or reindex) your documents so the keyword sub-fields get populated, then run your aggs on the .keyword fields:

{
  "size": 200,
  "aggs": {
    "buckets_for_name": {
      "terms": {
        "field": "name.keyword",
        "size": 200
      },
      "aggs": {
        "region_terms": {
          "terms": {
            "field": "region.keyword",
            "size": 200
          }
        }
      }
    }
  }
}
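As a rough sketch, the same request can be built in Python and its response flattened into the shape the question asks for. The helper names `build_agg_body` and `flatten_buckets` and the sample response are made up for illustration; only the aggregation names and the `.keyword` fields come from the request above.

```python
def build_agg_body(bucket_size=200):
    """Build the search body; aggs sit at the top level, next to size."""
    return {
        "size": bucket_size,
        "aggs": {
            "buckets_for_name": {
                "terms": {"field": "name.keyword", "size": bucket_size},
                "aggs": {
                    "region_terms": {
                        "terms": {"field": "region.keyword", "size": bucket_size}
                    }
                },
            }
        },
    }


def flatten_buckets(response):
    """Turn the nested buckets into one dict per name."""
    results = []
    for name_bucket in response["aggregations"]["buckets_for_name"]["buckets"]:
        regions = [b["key"] for b in name_bucket["region_terms"]["buckets"]]
        results.append({"name": name_bucket["key"], "region": regions})
    return results


# Hypothetical response fragment, shaped like a terms-aggregation result
sample_response = {
    "aggregations": {
        "buckets_for_name": {
            "buckets": [
                {
                    "key": "jess",
                    "doc_count": 2,
                    "region_terms": {
                        "buckets": [
                            {"key": "new york", "doc_count": 1},
                            {"key": "poland", "doc_count": 1},
                        ]
                    },
                }
            ]
        }
    }
}

print(flatten_buckets(sample_response))
# [{'name': 'jess', 'region': ['new york', 'poland']}]
```

The body returned by `build_agg_body()` could be passed to the Python client the same way the asker does in the comments, with `es.search(index="mytable", body=query)`.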

If you would rather have new york indexed without the space (newyork), look into the pattern_replace filter in your analyzers.


EDIT (from the comments): Aggs are not part of the query; they have their own scope. So change this:

{
  "query": {
    "aggs": {
      "buckets_for_name": {

to this

{
  "query": {
    // your query clause goes here; an empty "query": {} is rejected,
    // so leave the whole query attribute out if you don't need one
  },
  "aggs": {
    "buckets_for_name": {
  ...
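
For a request body built in Python, the same correction might look like the sketch below. `hoist_aggs` is a hypothetical helper, not part of any Elasticsearch client: it moves a misplaced `aggs` block out of `query` and drops `query` if nothing is left in it, since an empty clause is rejected.

```python
import copy


def hoist_aggs(body):
    """Return a copy of body with aggs moved from inside query to the top level."""
    fixed = copy.deepcopy(body)
    query = fixed.get("query")
    if isinstance(query, dict):
        # "aggregations" is the long form of "aggs"
        for key in ("aggs", "aggregations"):
            if key in query:
                fixed[key] = query.pop(key)
        if not query:
            # An empty query clause is rejected, so drop it entirely
            del fixed["query"]
    return fixed


broken = {
    "query": {
        "aggs": {
            "buckets_for_name": {
                "terms": {"field": "name.keyword", "size": 200}
            }
        }
    }
}

fixed = hoist_aggs(broken)
print(sorted(fixed))  # ['aggs']
```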
Joe - GMapsBook.com
  • I got `[ ]` as a result! – Enya Ece Arzu Aug 05 '20 at 16:33
  • I used `PUT mytable/_mapping { "properties": { "name": { "type": "text", "fielddata": true, "fields": { "keyword": { "type": "keyword" } } }, "region": { "type": "text", "fielddata": true, "fields": { "keyword": { "type": "keyword" } } } } }` – Enya Ece Arzu Aug 05 '20 at 16:35
  • what does `GET mytable/_search` return? Can you paste it here? – Joe - GMapsBook.com Aug 05 '20 at 17:01
  • `{ "took": 53, "timed_out": false,"_shards": { "total": 1,"successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 2395, "relation": "eq" }, "max_score": 1, "hits": [ { "_index": "mytable", "_type": "_doc", "_id": "7769", "_score": 1, "_source": { "id_mytable": 7769, "date": "2020-07-21" " region": "poland", "name": "AdminDB", } }, ` – Enya Ece Arzu Aug 05 '20 at 17:11
  • If you look closely, the `region` field name has a leading space: " region". Fix this and it'll work! – Joe - GMapsBook.com Aug 05 '20 at 17:31
  • I got this error: `{ "error": { "root_cause": [ { "type": "parsing_exception", "reason": "unknown query [aggs]", "line": 3, "col": 11 } ], "type": "parsing_exception", "reason": "unknown query [aggs]", "line": 3, "col": 11, "caused_by": { "type": "named_object_not_found_exception", "reason": "[3:11] unknown field [aggs]" } }, "status": 400 }` – Enya Ece Arzu Aug 05 '20 at 17:36
  • My query: `GET /mytable/_search { "query" : { "aggs": { "buckets_for_name": { "terms": { "field": "name.keyword", "size": 200 }, "aggs": { "region_terms": { "terms": { "field": "region.keyword", "size": 200 } } } } } } }` – Enya Ece Arzu Aug 05 '20 at 17:39
  • Updated my answer. – Joe - GMapsBook.com Aug 05 '20 at 18:14
  • I got `{ "error": { "root_cause": [ { "type": "illegal_argument_exception", "reason": "query malformed, empty clause found at [3:4]" } ], "type": "illegal_argument_exception", "reason": "query malformed, empty clause found at [3:4]" }, "status": 400 }` – Enya Ece Arzu Aug 05 '20 at 18:24
  • Aggs should be *outside* of the `query` statement. And it must not be empty either... Post your full query w/ the aggs & query and I'll fix it for ya. – Joe - GMapsBook.com Aug 05 '20 at 18:32
  • `GET /mytable/_search { "size": 222, "aggs": { "buckets_for_name": { "terms": { "field": "name.keyword", "size": 2222 }, "aggs": { "region_terms": { "terms": { "field": "region.keyword", "size": 222 } } } } } }` – Enya Ece Arzu Aug 05 '20 at 18:47
  • with python code: ` total = mytableDocument.search().count() query = { "size": total, "aggs": { "buckets_for_name": { "terms": {"field": "name.keyword","size": total }, "aggs": { "egion_terms": { "terms": { "field": "region.keyword", "size": total } } } } } } es = Elasticsearch() response = es.search(index= "mytable",body=query,size=total) ` – Enya Ece Arzu Aug 05 '20 at 18:50
  • `response["aggregations"]["buckets_for_name"]["buckets"]` didn't return anything, that's my problem – Enya Ece Arzu Aug 05 '20 at 19:00
  • Investigate `response`, then `response["aggregations"]` and go from there.... – Joe - GMapsBook.com Aug 05 '20 at 19:22