
Is it possible to define a custom aggregation function in Elasticsearch?

E.g. for data:

author weekday status
me     monday  ok
me     tuesday ok
me     monday  bad

I want to aggregate by author and weekday, and get the concatenation of the status field as the value:

agg1 agg2    value
me   monday  ok,bad
me   tuesday ok

I know you can do a count, but is it possible to define another function to use for the aggregation?

EDIT/ANSWER: It looks like there is no multi-row aggregation support in ES, so we had to use sub-aggregations on the last field (see Akshay's example). If you need a more complex aggregation function, aggregate by id (note that you won't be able to use _id, so you'll have to duplicate it in another field). That way you'll be able to do advanced aggregation on the individual items in each bucket.

Mikl X

1 Answer


You can get roughly what you want by using the sub-aggregations available in 1.0. Assuming the documents are structured as author, weekday and status, you could use the aggregation below:

{
  "size": 0,
  "aggs": {
    "author": {
      "terms": {
        "field": "author"
      },
      "aggs": {
        "days": {
          "terms": {
            "field": "weekday"
          },
          "aggs": {
            "status": {
              "terms": {
                "field": "status"
              }
            }
          }
        }
      }
    }
  }
}

Which gives you the following result:

{
   ...
   "aggregations": {
      "author": {
         "buckets": [
            {
               "key": "me",
               "doc_count": 3,
               "days": {
                  "buckets": [
                     {
                        "key": "monday",
                        "doc_count": 2,
                        "status": {
                           "buckets": [
                              {
                                 "key": "bad",
                                 "doc_count": 1
                              },
                              {
                                 "key": "ok",
                                 "doc_count": 1
                              }
                           ]
                        }
                     },
                     {
                        "key": "tuesday",
                        "doc_count": 1,
                        "status": {
                           "buckets": [
                              {
                                 "key": "ok",
                                 "doc_count": 1
                              }
                           ]
                        }
                     }
                  ]
               }
            }
         ]
      }
   }
}
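The innermost status buckets still have to be joined on the client to get the concatenated value from the question. Here is a minimal Python sketch of that step, assuming the response has already been parsed into a dict. The helper name `flatten_status` and the trimmed `response` literal are made up for illustration and are not part of any Elasticsearch client.

```python
def flatten_status(response, sep=","):
    """Return one (author, weekday, joined statuses) row per weekday bucket."""
    rows = []
    for author in response["aggregations"]["author"]["buckets"]:
        for day in author["days"]["buckets"]:
            # Keys come out in the order the status buckets were returned,
            # which is not necessarily the order the documents were indexed in.
            statuses = [bucket["key"] for bucket in day["status"]["buckets"]]
            rows.append((author["key"], day["key"], sep.join(statuses)))
    return rows


# Trimmed copy of the response shown above (doc_count fields omitted).
response = {
    "aggregations": {
        "author": {
            "buckets": [
                {
                    "key": "me",
                    "days": {
                        "buckets": [
                            {
                                "key": "monday",
                                "status": {"buckets": [{"key": "bad"}, {"key": "ok"}]},
                            },
                            {
                                "key": "tuesday",
                                "status": {"buckets": [{"key": "ok"}]},
                            },
                        ]
                    },
                }
            ]
        }
    }
}

for author, weekday, value in flatten_status(response):
    print(author, weekday, value)
```

Note that the joined value follows the bucket order in the response, so monday comes out as "bad,ok" rather than "ok,bad". Sort the keys yourself if you need a specific order.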
Akshay
Yeah, that's almost what I did. That solution still requires processing the last level of buckets on the client side. – Mikl X Feb 28 '14 at 18:33