I am indexing Tomcat access-log data into Elasticsearch (1.7.3). Each document describes a request by its end time and its duration in milliseconds (the start time can be calculated, and I can store it as well if that helps solve my problem). For example:
{
ztime: "10-17-2015T04:05:00.000+02:00",
duration: 4500,
thread: "http-nio-8080-exec-14"
},
{
ztime: "10-17-2015T04:07:42.227+02:00",
duration: 3100,
thread: "http-nio-8080-exec-25"
}
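For reference, deriving the start time from these two fields could look roughly like the Python sketch below. The function name `start_time` is only illustrative, and the format string is my reading of the sample `ztime` values (month-day-year), so treat both as assumptions.

```python
from datetime import datetime, timedelta

# Assumed layout of ztime, inferred from the sample documents (month-day-year).
ZTIME_FORMAT = "%m-%d-%YT%H:%M:%S.%f%z"

def start_time(ztime: str, duration_ms: int) -> datetime:
    """Return the request start, given its end time and duration in ms."""
    end = datetime.strptime(ztime, ZTIME_FORMAT)
    return end - timedelta(milliseconds=duration_ms)

start = start_time("10-17-2015T04:05:00.000+02:00", 4500)
print(start.isoformat(timespec="milliseconds"))
# prints 2015-10-17T04:04:55.500+02:00
```

This needs Python 3.7 or later, since earlier versions do not accept a colon in the `%z` offset.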
My goal is to produce a histogram that shows, for each second, how many threads were active.
I thought of using a date_histogram that aggregates my docs into 1-second buckets:
GET /mindex/mtype/_search?search_type=count
{
"aggs": {
"threads_per_hr": {
"date_histogram": {
"field": "ztime",
"interval": "1s",
"min_doc_count": 1
},
"aggs": {
"per_hr_threads": {
"cardinality": {
"field": "thread"
}
}
}
}
}
}
However, this way each document falls into only one bucket, the one that contains its ztime.
What I need is for each doc to be counted in every bucket it spans. For example, I will need the first document to be counted in the 04:05:00.000, 04:05:01.000, 04:05:02.000, 04:05:03.000 buckets.
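To make the desired expansion concrete, here is a minimal client-side sketch in Python of what I would like Elasticsearch to do for me. The name `second_buckets` and the sample request (starting at 04:05:00.000 and lasting 3100 ms) are hypothetical and only illustrate the idea; it assumes the start time is already known.

```python
from datetime import datetime, timedelta, timezone

def second_buckets(start: datetime, duration_ms: int) -> list:
    """Return the start of every 1-second bucket that [start, end] touches."""
    end = start + timedelta(milliseconds=duration_ms)
    bucket = start.replace(microsecond=0)  # floor to the whole second
    buckets = []
    while bucket <= end:
        buckets.append(bucket)
        bucket += timedelta(seconds=1)
    return buckets

# Hypothetical request: starts at 04:05:00.000 (+02:00) and lasts 3100 ms.
tz = timezone(timedelta(hours=2))
start = datetime(2015, 10, 17, 4, 5, 0, tzinfo=tz)
for b in second_buckets(start, 3100):
    print(b.strftime("%H:%M:%S"))
# prints 04:05:00, 04:05:01, 04:05:02, 04:05:03 (one per line)
```

The number of active threads in a given second would then be the count of distinct thread names whose documents land in that bucket.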
What kind of query (Java API and/or REST API) would help me achieve this goal?