I'm new to aggregations in Elasticsearch and am using 7.2. I am trying to write an aggregation on Tree.keyword that returns only the count of documents that have a key containing the word "Branch". I have tried sub-aggregations, bucket_selector (which doesn't work on string keys), and scripts. Does anyone have ideas or suggestions on how to approach this?

Mapping:

{
  "testindex" : {
    "mappings" : {
      "properties" : {
        "Tree" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword"
            }
          }
        }
      }
    }
  }
}

Example query that returns all the keys. What I need is to limit it to return only keys containing "Branch" or, better yet, just the count of how many "Branch" keys there are:

GET testindex/_search
{
  "aggs": {
    "bucket": {
      "terms": {
        "field": "Tree.keyword"
      }
    }
  }
}

Returns:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "testindex",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "Tree" : [
            "Car:76",
            "Branch:yellow",
            "Car:one",
            "Branch:blue"
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "bucket" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Car:76",
          "doc_count" : 1
        },
        {
          "key" : "Branch:yellow",
          "doc_count" : 1
        },
        {
          "key" : "Car:one",
          "doc_count" : 1
        },
        {
          "key" : "Branch:blue",
          "doc_count" : 1
        }
      ]
    }
  }
}

3 Answers

You have to add includes to limit the result. Here's a code sample; hopefully it helps.

GET testindex/_search
{
  "_source": {
    "includes": [
      "Branch"
    ]
  },
  "aggs": {
    "bucket": {
      "terms": {
        "field": "Tree.keyword"
      }
    }
  }
}
DeC

It is possible to filter the values for which buckets will be created. This is done with the include and exclude parameters, which accept regular expression strings or arrays of exact values. Additionally, include clauses can filter using partition expressions.
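
As a rough sketch of those forms, the request below uses an array of exact values for include and a regular expression for exclude. The values come from the sample document in the question; the aggregation names exact_branches and not_cars are made up for illustration.

```
# Illustrative only: aggregation names are hypothetical
GET testindex/_search
{
  "size": 0,
  "aggs": {
    "exact_branches": {
      "terms": {
        "field": "Tree.keyword",
        "include": ["Branch:yellow", "Branch:blue"]
      }
    },
    "not_cars": {
      "terms": {
        "field": "Tree.keyword",
        "exclude": "Car:.*"
      }
    }
  }
}
```

With the sample document, both aggregations should end up with the two Branch keys, since the only other keys start with "Car:".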

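For your case, it should look like this. The regular expression has to match the whole key, so the pattern is "Branch:.*" (a plain "Branch:*" would only match "Branch" followed by zero or more colons):

GET testindex/_search
{
  "aggs": {
    "bucket": {
      "terms": {
        "field": "Tree.keyword",
        "include": "Branch:.*"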
      }
    }
  }
}
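
If only the number of matching keys is needed rather than the keys themselves, one option that may work is a sibling stats_bucket pipeline aggregation over the filtered buckets; the count field in its result is the number of buckets it saw. This is a minimal sketch, not a tested solution for your data: the names branches and branch_count and the size of 1000 are assumptions, and the regular expression is written as "Branch:.*" because the pattern must match the whole key.

```
# Sketch: aggregation names and size are illustrative assumptions
GET testindex/_search
{
  "size": 0,
  "aggs": {
    "branches": {
      "terms": {
        "field": "Tree.keyword",
        "include": "Branch:.*",
        "size": 1000
      }
    },
    "branch_count": {
      "stats_bucket": {
        "buckets_path": "branches>_count"
      }
    }
  }
}
```

The terms aggregation returns only the top 10 buckets by default, so size has to be at least as large as the number of distinct Branch keys for the count to be complete.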
A l w a y s S u n n y

Thanks for all the help! Unfortunately, none of those solutions worked for me. I ended up using a script in a cardinality aggregation that returns each branch key as-is and maps everything else to a single placeholder key. I then used a bucket script (Total_Branches) to subtract 1 for that placeholder. There is probably a better solution out there, but hopefully this helps someone.


GET testindex/_search
{
  "aggs": {
    "bucket": {
      "cardinality": {
        "field": "Tree.keyword",
        "script": {
          "lang": "painless",
          "source": "if (_value.contains('Branch:')) { return _value; } return 1;"
        }
      }
    },
    "Total_Branches": {
      "bucket_script": {
        "buckets_path": {
          "my_var1": "bucket.value"
        },
        "script": "return params.my_var1-1"
      }
    }
  }
}