1

I have the following dataset

    [
      {
        "rating": "10",
        "subject": "maths"
      },
      {
        "rating": "9",
        "subject": "physics"
      },
      {
        "rating": "10",
        "subject": "chemistry"
      },
      {
        "rating": "5",
        "subject": "physics"
      },
      {
        "rating": "2",
        "subject": "geography"
      },
      {
        "rating": "5",
        "subject": "maths"
      },
      {
        "rating": "1",
        "subject": "geography"
      },
      {
        "rating": "5",
        "subject": "maths"
      }, 
      {
        "rating": "8",
        "subject": "chemistry"
      }
    ] 

What I need to do is find the avg rating for each subject, and then calculate the # of subjects in ranges of rating (0-2,2-5,5-8,8-10) with an elastic search query.

The query I have so far creates buckets for each subject calculating the avg of each bucket. But I can't find how to do a range aggregation on the result of the composite aggregation. Is it even possible? Is there an alternative?

Here is my query that buckets the data according to the subject and calculates the avg rating.

GET kibana_sample/_search
{
  "size":0,
    "aggs" : {
        "my_buckets": {
            "composite" : {
                "sources" : [
                    { "subject": { "terms" : { "field": "subject" } } }
                ]
            },
            "aggs": {
              "avg_rating": {
                "avg" : { "field" : "rating" }
              }
            }
         }
     }
} 

It results in the following.

"aggregations": {
    "my_buckets": {
      "buckets": [
        {
          "key": {
            "subject": "maths"
          },
          "doc_count": 3,
          "avg_rating": {
            "value": 6.66666667
          }
        },
        {
          "key": {
            "subject": "physics"
          },
          "doc_count": 2,
          "avg_rating": {
            "value": 7
          }
        },
        {
          "key": {
            "subject": "chemistry"
          },
          "doc_count": 2,
          "avg_rating": {
            "value": 9
          }
        },
        {
          "key": {
            "subject": "geography"
          },
          "doc_count": 2,
          "avg_rating": {
            "value": 1.5
          }
        }
      ]
   }
}

It's all good, but now I need to perform a range aggregation on top of this result to get the number of subjects in ranges of ratings eg:

ratings range: {0-2}: 1 subject, {2-5}: 0 subjects, {5-8}: 2 subjects, {8-10}: 1 subject

ben
  • 181
  • 1
  • 12
  • One way I know would work is by using a pipeline aggregation as in this answer: https://stackoverflow.com/a/50981267/4604579, a bit cumbersome but it'd work. – Val Jan 09 '19 at 06:50
  • Thanks Val. I'm going to try that and test the performance impact of that. Did you ever open a feature request for bucket_range? – ben Jan 09 '19 at 16:20

1 Answers1

0

You can use pipeline aggregations in order to concat one aggregation results through more aggregations. Another thing you can do is use scripts in the pipeline, in order to filter only the relevant results.

Check out for the scripts examples here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html