I have an index that has a mapping which is similar to
{
"id": {
"type": "long"
},
"start": {
"type": "date"
},
"end": {
"type": "date"
}
}
I want to create a date histogram so that each document falls into all buckets which intervals fall between "start" and "end".
Eg. if for one document "start" = 12/01/2018, "end" = 04/25/2019, my date-histogram interval are weeks and the range is now-1y until now. I now want the document to fall into every bucket starting the week of 12/01/2018 until the week of 04/25/2019. So with just this one document the result should be 52 buckets where the buckets April to dezember have doc_count 0 and the buckets Dezember to April have doc_count 1.
As I see it date-histogram only gives me the option to match my document to exactly one bucket depending on one field, either "start" or "end".
What I have tried so far:
- Dynamically generate a query with 52 filters which checks if a document falls into this "bucket"
- Try to make use of painless scripts in each query
Both solutions were extremly slow. I am working with around 200k documents and such queries took around 10 seconds.
EDIT: Here is a sample query that is generated dynamically. As can be seen one filter is created per week. This query takes about 10 seconds which is way to long
%{
aggs: %{
count_chart: %{
aggs: %{
last_seen_over_time: %{
filters: %{
filters: %{
"2018-09-24T00:00:00Z" => %{
bool: %{
must: [
%{range: %{start: %{lte: "2018-09-24T00:00:00Z"}}},
%{range: %{end: %{gte: "2018-09-17T00:00:00Z"}}}
]
}
},
"2018-12-24T00:00:00Z" => %{
bool: %{
must: [
%{range: %{start: %{lte: "2018-12-24T00:00:00Z"}}},
%{range: %{end: %{gte: "2018-12-17T00:00:00Z"}}}
]
}
},
"2019-04-01T00:00:00Z" => %{
bool: %{
must: [
%{range: %{start: %{lte: "2019-04-01T00:00:00Z"}}},
%{range: %{end: %{gte: "2019-03-25T00:00:00Z"}}}
]
}
}, ...
}
}
}
},
size: 0
}
And a sample response:
%{
"_shards" => %{"failed" => 0, "skipped" => 0, "successful" => 5, "total" => 5},
"aggregations" => %{
"count_chart" => %{
"doc_count" => 944542,
"last_seen_over_time" => %{
"buckets" => %{
"2018-09-24T00:00:00Z" => %{"doc_count" => 52212},
"2018-12-24T00:00:00Z" => %{"doc_count" => 138509},
"2019-04-01T00:00:00Z" => %{"doc_count" => 119634},
...
}
}
}
},
"hits" => %{"hits" => [], "max_score" => 0.0, "total" => 14161812},
"timed_out" => false,
"took" => 2505
}
I hope this question is understandable. If not I will explain it more in detail.