This is more really a question of what you expect the output to look like, as any aggregated result essentially needs to group at the lowest level and then progressively group at higher "grains" until the largest level ( "month" ) is reached. This kind of implies data grouped by "month" ultimately, unless you break it down otherwise.
In essence, progressively $group
:
db.collection.aggregate([
// First total per day. Rounding dates with math here
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$createdAt", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$createdAt", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"week": { "$first": { "$week": "$createdAt" } },
"month": { "$first": { "$month": "$createdAt" } },
"total": { "$sum": "$num" }
}},
// Then group by week
{ "$group": {
"_id": "$week",
"month": { "$first": "$month" },
"days": {
"$push": {
"day": "$_id",
"total": "$total"
}
},
"total": { "$sum": "$total" }
}},
// Then group by month
{ "$group": {
"_id": "$month",
"weeks": {
"$push": {
"week": "$_id",
"total": "$total",
"days": "$days"
}
},
"total": { "$sum": "$total" }
}}
])
So each level after the first which sums up per day is then progressively pushed into array content for it's "round up" value and the totals are then summed at that level as well.
If you want a flatter output with one record per day containing it's weekly and monthly totals as well as the day total, then just append two $unwind
statements to the end of the pipeline:
{ "$unwind": "$weeks" },
{ "$unwind": "$weeks.days" }
And optionally $project
the "dotted" fields out to something flatter and readable if you must.
If you are spanning "years" with this, then include such an operation in the grouping key at least from the "week" level so you are not possibly combining data from different years and they are separated.
It is also my own general preference to use the "date math" approach when rounding dates as it returns a Date
object, but as is used at the other levels than "day", you can just alternately use the date aggregation operators instead.
No need for mapReduce
as this is fairly intuitive and there is a finite amount of days in a month that means the BSON limit when nesting arrays in the content while aggregating will not be broken.