I'm using a nested mapping (below) in which each document represents a "task" and has a nested "requests" element that records progress towards that task.

I'm trying to find all tasks which have not made progress, i.e. all documents for which the "max" aggregation on the nested objects is empty. This requires being able to filter on the result of an aggregation - and that's where I'm a bit stuck.

I can order by the results of the aggregation, but I can't find a way to filter on them. Is there such a capability?

mapping:

mapping = {
  properties: {
    'prefix' => {
      type: "string",
      store: true, 
      index: "not_analyzed"
    },
    'last_marker' => {
      type: "string", 
      store: true, 
      index: "not_analyzed"
    },
    'start_time' => {
      type: "date", 
      store: true, 
      index: "not_analyzed"
    },
    'end_time' => {
      type: "date", 
      store: true, 
      index: "not_analyzed"
    },
    'obj_count' => {
      type: "long", 
      store: true, 
      index: "not_analyzed"
    },
    'requests' => {
      type: 'nested',
      include_in_parent: true,
      'properties' => {
        'start_time' => {
          type: "date", 
          store: true, 
          index: "not_analyzed"
        },
        'end_time' => {
          type: "date", 
          store: true, 
          index: "not_analyzed"
        },
        'amz_req_id' => {
          type: "string", 
          store: true, 
          index: "not_analyzed"
        },
        'last_marker' => {
          type: "string", 
          store: true, 
          index: "not_analyzed"
        }
      }
    }
  }
}

Ordering by aggregation query (and looking for the filter...):

{
  "size":0,
  "aggs": {
    "pending_prefix": {
      "terms": {
        "field": "prefix",
        "order": {"max_date": "asc"},
        "size":20000
      },
      "aggs": {
        "max_date": {
          "max": {
            "field": "requests.end_time"
          }
        }
      }
    }
  }
}
Saeed Zhiany
aabes
  • What do you mean by "documents for which the "max" aggregation on the nested objects is empty"? For this aggregation to give an empty value, wouldn't the requests.end_time field have to be empty for that entire document? If so, can't you search for documents which don't contain the field requests.end_time? – Vineeth Mohan Dec 23 '14 at 21:52
  • Try getting rid of `store: true` in the mapping (at least for the fields you want to aggregate on). I had the same issue. For some reason Elasticsearch won't aggregate on fields that have this flag in the mapping. Don't forget to reindex your data after changing the mapping. – Dmitry Balabka Jun 09 '15 at 19:44
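
The suggestion in the first comment (searching for documents that lack requests.end_time) could look roughly like the sketch below. It assumes the 1.x-era `missing` filter wrapped in a `filtered` query, and that `include_in_parent: true` copies the nested field onto the parent document so the filter can see it. The helper name `tasks_without_requests_query` is hypothetical. Note that this returns matching task documents rather than per-prefix buckets.

```ruby
require 'json'

# Hypothetical helper: builds a search body that matches documents with no
# value for the given field. Assumes the Elasticsearch 1.x "filtered" query
# and "missing" filter; later releases changed this syntax.
def tasks_without_requests_query(field = 'requests.end_time')
  {
    query: {
      filtered: {
        filter: {
          missing: { field: field }
        }
      }
    }
  }
end

# Print the body as JSON so it can be sent to the _search endpoint.
puts JSON.pretty_generate(tasks_without_requests_query)
```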

1 Answer

This is the equivalent of a HAVING clause in SQL. It is not possible in the current Elasticsearch release.

The upcoming 2.0 release introduces pipeline aggregations, which should make it possible.

More: https://www.elastic.co/blog/out-of-this-world-aggregations
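
Until then, one possible workaround is to run the ordering query from the question and do the "HAVING" step on the client. The sketch below assumes the response has already been parsed into a Ruby hash, and that a bucket with no `requests.end_time` values reports its `max_date` value as `null`. The helper name `prefixes_without_progress` and the sample response are made up for illustration.

```ruby
require 'json'

# Hypothetical helper (not part of any Elasticsearch client): given a parsed
# aggregation response, return the keys of the terms buckets whose metric
# sub-aggregation has no value.
def prefixes_without_progress(response, agg_name = 'pending_prefix', metric = 'max_date')
  buckets = response.fetch('aggregations', {}).fetch(agg_name, {}).fetch('buckets', [])
  buckets.select { |bucket| bucket.fetch(metric, {})['value'].nil? }
         .map { |bucket| bucket['key'] }
end

# Made-up sample response, shaped like the output of the terms + max query.
sample_response = JSON.parse(<<-JSON)
{
  "aggregations": {
    "pending_prefix": {
      "buckets": [
        { "key": "a/", "doc_count": 3, "max_date": { "value": 1419371520000 } },
        { "key": "b/", "doc_count": 1, "max_date": { "value": null } }
      ]
    }
  }
}
JSON

puts prefixes_without_progress(sample_response).inspect
```

This still transfers every bucket to the client, so it only suits cases where the number of distinct prefixes is manageable.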

Tao Wen