Below is my Elasticsearch configuration:

  1. 1 Cluster
  2. 1 Node
  3. 1 Index
  4. 3 primary shards (each has 1 replica shard, but the replicas are in UNASSIGNED state as there is only 1 node).

I have indexed documents, and they are spread across the 3 shards (Shard-0, Shard-1, Shard-2).

The terms aggregation I am trying:

POST myIndex/_search
{
  "query": {"match_all": {}}, 
  "size":0,
  "aggs": {
    "products": {
      "terms": {
        "field": "BillToID",
        "size": 10,
        "shard_size": 11,
        "show_term_doc_count_error": true
      }
    }
  }
}

Response:

"aggregations" : {
    "products" : {
      "doc_count_error_upper_bound" : 7,
      "sum_other_doc_count" : 12,
      "buckets" : [
        {
          "key" : "ProductA",
          "doc_count" : 100,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductC",
          "doc_count" : 54,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductZ",
          "doc_count" : 52,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductG",
          "doc_count" : 47,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductH",
          "doc_count" : 44,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductB",
          "doc_count" : 43,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductE",
          "doc_count" : 31,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductF",
          "doc_count" : 19,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductI",
          "doc_count" : 11,
          "doc_count_error_upper_bound" : 6
        },
        {
          "key" : "ProductJ",
          "doc_count" : 9,
          "doc_count_error_upper_bound" : 6
        }
      ]
    }
  }

From the docs, the definition of the per-bucket doc_count_error_upper_bound:

This is calculated by summing the document counts for the last term returned by all shards which did not return the term.
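To make that definition concrete, here is a minimal Python sketch of how I read it. It is not Elasticsearch's actual implementation; the function name `per_bucket_error_upper_bound` and the shard data are hypothetical, made up only for illustration (3 shards, each returning its top 3 terms).

```python
from typing import List, Tuple

# One shard's response: (term, doc_count) pairs, sorted by doc_count descending.
ShardResult = List[Tuple[str, int]]


def per_bucket_error_upper_bound(term: str, shard_results: List[ShardResult]) -> int:
    """Sum the doc count of the last returned term over shards that did NOT return `term`."""
    error = 0
    for buckets in shard_results:
        if not buckets:
            continue  # an empty shard cannot hide any documents for the term
        returned_terms = {key for key, _ in buckets}
        if term not in returned_terms:
            # The term missed the cut on this shard, so it can have at most
            # as many documents there as the last term that was returned.
            error += buckets[-1][1]
    return error


# Hypothetical per-shard results, not taken from the real index.
shards = [
    [("ProductA", 40), ("ProductC", 20), ("ProductZ", 5)],
    [("ProductA", 35), ("ProductG", 10), ("ProductC", 4)],
    [("ProductC", 30), ("ProductZ", 25), ("ProductG", 2)],
]

for term in ("ProductA", "ProductC", "ProductZ", "ProductG"):
    print(term, per_bucket_error_upper_bound(term, shards))
```

Under this reading, a term that every shard returned (ProductC in the made-up data) gets an error of 0, which is what I expected for ProductA in my real response.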

Problem: When I checked, I could see that ProductA was returned by every shard, so why does it show "doc_count_error_upper_bound" : 6 for ProductA?

Any help is much appreciated :)

Nishikant Tayade
  • Calculation of doc_count_error_upper_bound is explained quite well here: https://stackoverflow.com/questions/37513634/what-is-the-significance-of-doc-count-error-upper-bound-in-elasticsearch-and-how – ilvar Jan 13 '22 at 15:12
  • @ilvar, as I mentioned, I know how the calculation should happen, but in the above case it is not happening as described in the docs. I have posted the same question on the Elastic forum; they have confirmed that it's a valid issue and will get back to me. – Nishikant Tayade Jan 17 '22 at 04:44
