I'm using a composite aggregation because my use case requires pagination over the buckets returned by the aggregation, and the number of buckets could be huge.

I also need to run the composite aggregation over a specific time range, so I'm using a "range" query for this. However, the response from Elasticsearch contains no buckets under "aggregations", even though the "hits" array contains the expected documents for the filter. I'm only interested in the result of the composite aggregation.

Am I missing something?

The request is as follows:

GET <my-index>/_search 
{
  "query" : {
    "range" : {
      "@timestamp" : {
        "gte" : "2020-03-29T14:53:42.068Z",
        "lt" : "2020-03-29T15:53:42.068Z"
      }
    }
  },
  "aggs" : {
    "uniq_userids": {
      "composite" : {
        "size": 100,
        "sources" : [
          { "by_userid": { "terms" : { "field": "userid.keyword" } } }
        ]
      }
    } 
  } 
}
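
For context, this is roughly how I intend to page through the buckets once the aggregation returns data. It is only a sketch: `search` is a hypothetical callable that sends a request body to the index and returns the parsed response (for example a thin wrapper around a client's search call), and `build_body` and `iter_buckets` are names made up for this illustration. It relies on the composite aggregation's `after` parameter, and it sets `"size": 0` since only the aggregation result is of interest.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def build_body(after_key: Optional[Dict[str, Any]] = None,
               page_size: int = 100) -> Dict[str, Any]:
    """Build the search body from the question, optionally resuming after a key."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [
            {"by_userid": {"terms": {"field": "userid.keyword"}}}
        ],
    }
    if after_key is not None:
        # Resume from the last bucket key of the previous page.
        composite["after"] = after_key
    return {
        "size": 0,  # skip the hits; only the aggregation is needed
        "query": {
            "range": {
                "@timestamp": {
                    "gte": "2020-03-29T14:53:42.068Z",
                    "lt": "2020-03-29T15:53:42.068Z",
                }
            }
        },
        "aggs": {"uniq_userids": {"composite": composite}},
    }


def iter_buckets(search: Callable[[Dict[str, Any]], Dict[str, Any]],
                 page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every composite bucket, one page at a time."""
    after_key: Optional[Dict[str, Any]] = None
    while True:
        response = search(build_body(after_key, page_size))
        agg = response["aggregations"]["uniq_userids"]
        buckets = agg["buckets"]
        if not buckets:
            return  # an empty page means there is nothing left
        yield from buckets
        after_key = agg.get("after_key")
        if after_key is None:
            return
```

The loop stops on the first empty page or when the response carries no `after_key`.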

The response is as follows:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
          ...
    ]
  },
  "aggregations" : {
    "uniq_userids" : {
      "buckets" : [ ]
    }
  }
}

Expected output in the "aggregations" element:

  "aggregations" : {
    "uniq_userids" : {
      "after_key" : {
        "by_userid" : "user4"
      },
      "buckets" : [
        {
          "key" : {
            "by_userid" : "user1"
          },
          "doc_count" : 3
        },
        {
          "key" : {
            "by_userid" : "user2"
          },
          "doc_count" : 3
        },
        {
          "key" : {
            "by_userid" : "user3"
          },
          "doc_count" : 2
        }
      ]
    }
  }

Example of a document

      {
        "_index": "xxxxxxxxxxx",
        "_type": "doc",
        "_id": "fpFcKXEB9-HOO02nOoEG",
        "_score": 1,
        "_source": {
          "message": "xxxxxxxxxxx",
          "input": {
            "type": "log"
          },
          "tags": [
            "beats_input_codec_plain_applied"
          ],
          "offset": 56422597,
          "logday": "2020-03-29",
          "userid": "user1",
          "source": "xxxxxxxxxxx",
          "@version": "1",
          "prospector": {
            "type": "log"
          },
          "Micro_time": "2020-03-29 14:54:01.366719",
          "logtime": "14:54:01.366719",
          "@timestamp": "2020-03-29T14:54:01.366Z"
        }
      }
codeseeker
  • { "from":"2020-03-29T14:00:00.000Z" "to":"2020-03-29T14:00:00.000Z" } ---> both dates are the same; is that correct? Can you add sample data that should appear in the aggregation result? – jaspreet chahal Mar 31 '20 at 02:48
  • Shouldn't the time filtering be in the query? – Johnny Mar 31 '20 at 03:17
  • @jaspreetchahal, thanks for pointing that out. I had pasted the wrong query, one I tried earlier. I have corrected it now; please refer to the corrected query in the post. I have also pasted the expected aggregation output. Thanks! – codeseeker Mar 31 '20 at 04:39
  • @Johnny, thanks for pointing that out. It was a typo; I have corrected it now. – codeseeker Mar 31 '20 at 04:39
  • @codeseeker, can you add a document whose timestamp falls between 2020-03-29T14:53:42.068Z and 2020-03-29T15:53:42.068Z? Your query looks fine, but I need a document because it could be a timezone issue. Can you also add your mapping? – jaspreet chahal Mar 31 '20 at 04:44
  • Looks correct, make sure your `userid` has a keyword type. – Johnny Mar 31 '20 at 05:26
  • @Johnny, I used "userid.keyword" as value for "field" in "terms" aggregation – codeseeker Mar 31 '20 at 05:51
  • @jaspreetchahal, I have added an example of a document – codeseeker Mar 31 '20 at 06:44
  • add your mapping – Johnny Mar 31 '20 at 06:50
  • I tried with your document and the result appears in the aggregation. I think userid.keyword doesn't exist; check your mapping. – jaspreet chahal Mar 31 '20 at 07:08
  • I debugged again. The issue was in the Logstash patterns file. The Grok pattern initially stored the user ID in a "usrid" field, and the data for 2020-03-29 was indexed with that field. Later I renamed "usrid" to "userid" in the Grok patterns and updated the ES query to use "userid" as well. Since the 2020-03-29 data had no "userid" field, ES returned no buckets in the response. Thanks for your time, jaspreet and Johnny! – codeseeker Mar 31 '20 at 10:59
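
A quick way to catch this kind of field-name mismatch is to count, within the same time range, how many documents actually contain each candidate field. The sketch below builds such a request with a `filters` aggregation over `exists` queries and reads the per-field counts from the response. The function names and the aggregation name `field_presence` are made up for illustration, and sending the body to Elasticsearch is left to whatever client is in use.

```python
from typing import Any, Dict, List


def field_presence_body(fields: List[str], gte: str, lt: str) -> Dict[str, Any]:
    """Build a search body that counts documents containing each field."""
    return {
        "size": 0,  # only the counts are needed, not the hits
        "query": {"range": {"@timestamp": {"gte": gte, "lt": lt}}},
        "aggs": {
            "field_presence": {
                "filters": {
                    # One named filter per field, keyed by the field name.
                    "filters": {f: {"exists": {"field": f}} for f in fields}
                }
            }
        },
    }


def absent_fields(response: Dict[str, Any]) -> List[str]:
    """Return the fields that no document in the time range contains."""
    buckets = response["aggregations"]["field_presence"]["buckets"]
    return sorted(name for name, b in buckets.items() if b["doc_count"] == 0)
```

Calling `field_presence_body(["userid", "usrid"], "2020-03-29T14:53:42.068Z", "2020-03-29T15:53:42.068Z")` and passing the response to `absent_fields` would list any field that is missing from every document in that window.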

0 Answers