0

I'm trying to calculate some percentages with Elasticsearch but I have a (small) problem. I want ES to calculate the following: "(wins / Total) * 100".

So I added:

"bucket_script": {
      "buckets_paths": {
        "total": "TotalStatus",
        "wins": "TotalWins"
      },
      "script": " (total/ wins) * 100"
}

To my ES request, which looks like:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "*",
            "analyze_wildcard": true
          }
        }
      ],
      "must_not": []
    }
  },
  "aggs": {
   "status": {
      "terms": {
        "field": "status.raw"
      }
    },
    "wins": {
      "terms": {
        "field": "status.raw",
        "include": {
          "pattern": "Accepted|Released|Closed"
        }
      }
    },
    "losses": {
      "terms": {
        "field": "status.raw",
        "include": {
          "pattern": "Rejected"
        }
      }
    },
     "TotalStatus": {
      "sum_bucket": {
        "buckets_path": "status._count"
      }
    },
   "TotalWins": {
      "sum_bucket": {
        "buckets_path": "wins._count"
      }
    },
   "TotalLosses": {
      "sum_bucket": {
        "buckets_path": "losses._count"
      }
    }
  }
}

This however returns the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "Could not find aggregator type [buckets_paths] in [bucket_script]",
        "line": 54,
        "col": 28
      }
    ],
    "type": "parsing_exception",
    "reason": "Could not find aggregator type [buckets_paths] in [bucket_script]",
    "line": 54,
    "col": 28
  },
  "status": 400
}

Any idea's?

Rick van Lieshout
  • 2,276
  • 2
  • 22
  • 39

1 Answers1

0

I played a lot with bucket_script but I guess it might not be possible as it can't be top level aggregation and also you would need both total_wins and total_status coming from same parent aggregation with one numeric value and I think it might not be possible.

But it can be solved by scripted metric aggregation

{
  "size": 0,
  "aggs": {
    "win_loss_ratio": {
      "scripted_metric": {
        "init_script": "_agg['win_count'] = 0; _agg['total_count'] = 0; _agg['win_status']=['Accepted','Released','Closed'];",
        "map_script": "if (doc['status.raw'].value in _agg['win_status']) { _agg['win_count']+=1};if (doc['status.raw'].value) { _agg['total_count']+=1}",
        "combine_script": "return [_agg['win_count'],_agg['total_count']];",
        "reduce_script": "total_win = 0; total_status_count=0; for (a in _aggs) { total_win += a[0]; total_status_count += a[1] }; if(total_win == 0) {return 0} else {return (total_status_count/total_win) * 100}"
      }
    }
  }
}
  • init_script initializes three variables. win_status array has all the values corresponding to win status.
  • map_script iterates through every document, if the status.raw value is in win_status then win_count is incremented and if it has any value at all total_count is incremented(you could remove this if condition if you also want to include null values)
  • combine_script gets all values per shard
  • reduce_script sums all the values and then divides it. There is also a check so that we dont divide by zero or script will throw exception.
ChintanShah25
  • 12,366
  • 3
  • 43
  • 44
  • I've filed a [similar bug](https://github.com/elastic/elasticsearch/pull/19863) and my PR is not merged yet, but you might get ahead simply by defining `"params": {"_agg": {}}` in your `script_metric` aggregation. – Val Nov 07 '16 at 09:31
  • I just tried this and unfortunately it doesn't work either. Here's the image: http://imgur.com/a/8Brqx – Rick van Lieshout Nov 07 '16 at 10:51
  • Oh, It worked on 2.x. I will test on 5.0 and get back to you. – ChintanShah25 Nov 07 '16 at 14:45
  • Thanks, much appreciated :) You're probably good to go on the docker front but in case you need it I have a git repo for installing docker on a linux vm with logstash ES5 and kibana: https://github.com/Mastermindzh/Scripts/tree/master/docker – Rick van Lieshout Nov 07 '16 at 14:49