I am trying to paginate aggregation results 50 buckets at a time, so I tried the code below.

 "aggs": {
            "source_list": {
                "terms": {
                    "field": "source.keyword",
                    "from": 0,
                    "size": 50,
                },
            },
        },

This seemed pretty straightforward, but instead I got the following error.

{"detail":"RequestError(400, 'x_content_parse_exception', '[1:59] [terms] unknown field [from]')"}
TrickOrTreat
  • Does this answer your question? [Aggregation + sorting +pagination in elastic search](https://stackoverflow.com/questions/27776582/aggregation-sorting-pagination-in-elastic-search) – Ashraful Islam Apr 23 '20 at 05:09
  • The correct way of "paginating" through terms buckets, is by using the composite aggregation: https://stackoverflow.com/a/54800209/4604579 – Val Apr 23 '20 at 06:46

3 Answers

Pagination with from is not supported in the terms aggregation in Elasticsearch.

Since only size is supported, you have to remove the from parameter from the aggs query. If the total number of buckets is reasonable, just increase size enough to return them all. Otherwise, you could try partitioning the aggregation.

For example :

"aggs": {
    "source_list": {
        "terms": {
            "field": "source.keyword",
            "size": 50,
            "include": {
                "partition": 0,
                "num_partitions": 10
            }
        }
    }
}
  • Pick a value for num_partitions to break the number up into more manageable chunks
  • Pick a size value for the number of responses we want from each partition
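To walk through every partition, the same request can be issued once per partition number. Below is a minimal Python sketch of that idea, which only builds the request bodies. The helper name partition_bodies is hypothetical, and sending each body to Elasticsearch is left to whatever client you use.

```python
def partition_bodies(field, num_partitions, size):
    """Yield one terms-aggregation request body per partition (0 .. num_partitions - 1)."""
    for partition in range(num_partitions):
        yield {
            "size": 0,  # no hits needed, only the aggregation buckets
            "aggs": {
                "source_list": {
                    "terms": {
                        "field": field,
                        "size": size,
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                    }
                }
            },
        }


# Ten request bodies, one per partition, each asking for up to 50 buckets.
bodies = list(partition_bodies("source.keyword", num_partitions=10, size=50))
```

Each body differs only in the partition number, so the union of all responses covers every term once.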

Source : Elasticsearch filtering values with partitions

Ashraful Islam

You can only paginate the returned hits, not the aggregation buckets:

{
      "query": {
          ....
      },
      "from": 0,
      "size": 50,
      "aggs":{
          ....
      }
}
Frank
  • I have already tried this, this is giving me only 10 documents – TrickOrTreat Apr 23 '20 at 04:39
  • You mean 10 buckets? Yes by default it returns 10 buckets only. In your request query, remove the `from` and keep `size`, that will return 50 buckets. If you want all buckets, just set size to 0. – Frank Apr 23 '20 at 04:45
  • I don't need hits, I need the aggregation buckets only; that's why I aggregated. I want only this aggregation result, but the "from" keyword does not work inside it. – TrickOrTreat Apr 23 '20 at 06:10

The from and size parameters, as used in a query, are not available in aggregations.

You can use the options below to paginate through aggregations:

  1. Composite aggregation: combines multiple sources into a single set of buckets and allows pagination and sorting on them. It can only paginate linearly using after_key, i.e. you cannot jump from page 1 to page 3. You fetch "n" records, then pass the returned after_key to fetch the next "n" records.
GET index22/_search
{
 "size": 0,
 "aggs": {
   "pagination": {
     "composite": {
       "size": 1,
       "sources": [
         {
           "source_list": {
             "terms": {
               "field": "sources.keyword"
             }
           }
         }
       ]
     }
   }
 }
}

Result:

"aggregations" : {
    "pagination" : {
      "after_key" : {
        "source_list" : "a" --> used to fetch next records linearly
      },
      "buckets" : [
        {
          "key" : {
            "source_list" : "a"
          },
          "doc_count" : 1
        }
      ]
    }
  }

To fetch the next record, pass the returned after_key as after:

{
  "size": 0,
  "aggs": {
    "pagination": {
      "composite": {
        "size": 1,
        "after": {"source_list" : "a"}, 
        "sources": [
          {
            "source_list": {
              "terms": {
                "field": "sources.keyword"
              }
            }
          }
        ]
      }
    }
  }
}
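The after_key loop can be sketched in Python as below. This is a sketch under assumptions, not a definitive implementation: search is a hypothetical callable that takes a request body (a dict) and returns the decoded response (a dict), for example a thin wrapper around your client's search call. The aggregation names mirror the example above.

```python
def iter_composite_buckets(search, field, page_size=50):
    """Yield every composite bucket, following after_key page by page.

    `search` is a hypothetical callable: request body (dict) -> response (dict).
    """
    after = None
    while True:
        composite = {
            "size": page_size,
            "sources": [{"source_list": {"terms": {"field": field}}}],
        }
        if after is not None:
            # Resume right after the last bucket of the previous page.
            composite["after"] = after
        body = {"size": 0, "aggs": {"pagination": {"composite": composite}}}
        result = search(body)["aggregations"]["pagination"]
        buckets = result["buckets"]
        if not buckets:
            break  # an empty page means there is nothing left
        yield from buckets
        after = result.get("after_key")
        if after is None:
            break
```

The loop stops when a page comes back empty or carries no after_key, so the caller never has to track page numbers.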
  2. Include partition: groups the field's values into a number of partitions at query time and processes only one partition per request. Terms are distributed roughly evenly across the partitions, so you need to know the approximate number of terms beforehand to pick num_partitions. You can use a cardinality aggregation to get that count.
GET index/_search
{
  "size": 0,
  "aggs": {
    "source_list": {
      "terms": {
        "field": "sources.keyword",
        "include": {
          "partition": 1,
          "num_partitions": 3
        }
      }
    }
  }
}
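One way to pick num_partitions from a cardinality estimate is sketched below in Python. The helper names, the term_count aggregation name, and the 1.5 headroom factor are all assumptions chosen for illustration. The cardinality aggregation returns an approximate count, and partitions are only roughly even, so the sketch asks for more buckets per request than the average partition should hold.

```python
import math


def cardinality_body(field):
    """Request body that estimates the number of distinct terms in `field`."""
    return {"size": 0, "aggs": {"term_count": {"cardinality": {"field": field}}}}


def plan_partitions(estimated_terms, terms_per_request=50):
    """Return (num_partitions, size) for a partitioned terms aggregation."""
    num_partitions = max(1, math.ceil(estimated_terms / terms_per_request))
    # Headroom so an unusually full partition is not truncated.
    # The 1.5 factor is an arbitrary assumption, not a documented rule.
    size = math.ceil(terms_per_request * 1.5)
    return num_partitions, size


# About 120 distinct terms at roughly 50 per request gives 3 partitions.
num_partitions, size = plan_partitions(120, terms_per_request=50)
```

If a partition response returns exactly size buckets, it may have been truncated, which suggests raising num_partitions or size.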

  3. Bucket sort aggregation: sorts the buckets of its parent multi-bucket aggregation. Buckets may be sorted by _key, _count, or a sub-aggregation. It only applies to buckets returned by the parent aggregation, so you need to set the terms size to a large value such as 10,000 (the maximum allowed by default at the time of writing) and truncate the buckets in bucket_sort. You can paginate using from and size, just like in a query. If you have more than 10,000 terms you won't be able to use it, since it only selects from the buckets returned by the terms aggregation.
GET index/_search
{
  "size": 0,
  "aggs": {
    "source_list": {
      "terms": {
        "field": "sources.keyword",
        "size": 10000
      },
      "aggs": {
        "my_bucket": {
          "bucket_sort": {
            "sort": [
              {
                "_key": {
                  "order": "asc"
                }
              }
            ],
            "from": 1, 
            "size": 1
          }
        }
      }
    }
  }
}
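Because bucket_sort accepts from and size, a page number can be translated into a request body directly. The Python sketch below assumes 1-based page numbers. The helper name bucket_sort_page_body and its max_terms parameter are hypothetical; the aggregation names mirror the request above.

```python
def bucket_sort_page_body(field, page, page_size=50, max_terms=10000):
    """Build a request body returning page `page` (1-based) of terms buckets."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "source_list": {
                # The parent must return every bucket that bucket_sort may page over.
                "terms": {"field": field, "size": max_terms},
                "aggs": {
                    "my_bucket": {
                        "bucket_sort": {
                            "sort": [{"_key": {"order": "asc"}}],
                            "from": (page - 1) * page_size,
                            "size": page_size,
                        }
                    }
                },
            }
        },
    }


# Page 3 with 50 buckets per page skips the first 100 buckets.
body = bucket_sort_page_body("sources.keyword", page=3, page_size=50)
```

Unlike the composite approach, this allows jumping straight to any page, at the cost of computing up to max_terms buckets on every request.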

In terms of performance, the composite aggregation is the better choice.

jaspreet chahal