So the first thing really to cover is your current "dates" are all "strings", and this really does not help. You would be better served converting everything to BSON Date, since that is basically what is required in the aggregation operations anyway.
Second point is that getting the totals for "each day" within an interval is not easy. In fact you really need to throw some expressions at MongoDB in order to do such a thing:
db.collection.aggregate([
{ "$addFields": {
"start": { "$toDate": "$start.day" },
"end": { "$toDate": "$end.day" },
"date": { "$toDate": "$date" },
"dayworking": {
"$map": {
"input": {
"$range": [
0,
{ "$ceil": {
"$divide": [
{ "$subtract": [
{ "$toDate": "$end.day" },
{ "$toDate": "$start.day" }
]},
1000 * 60 * 60 * 24
]
}}
]
},
"in": {
"$toDate": {
"$add": [
{ "$multiply": ["$$this", 1000 * 60 * 60 * 24 ] },
{ "$subtract": [
{ "$toLong": { "$toDate": "$start.day" } },
{ "$mod": [ { "$toLong": { "$toDate": "$start.day" } }, 1000 * 60 * 60 * 24 ] }
]}
]
}
}
}
}
}},
{ "$unwind": "$dayworking" },
{ "$group": {
"_id": {
"employee": "$employee",
"day": "$dayworking"
},
"hours": {
"$sum": {
"$floor": {
"$divide": [
{ "$switch": {
"branches": [
{
"case": {
"$and": [
{ "$lt": [ "$dayworking", "$start" ] },
{ "$gt": [
{ "$add": [ "$dayworking", 1000 * 60 * 60 * 24 ] },
"$end"
]}
]
},
"then": { "$subtract": [ "$end", "$start" ] }
},
{
"case": {
"$lt": [
"$end",
{ "$add": [ "$dayworking", 1000 * 60 * 60 * 24 ] }
]
},
"then": {
"$subtract": [ "$end", "$dayworking" ]
}
},
{
"case": { "$lt": [ "$dayworking", "$start" ] },
"then": {
"$subtract": [
{ "$add": [ "$dayworking", 1000 * 60 * 60 * 24 ] },
"$start"
]
}
}
],
"default": 1000 * 60 * 60 * 24
}},
1000 * 60 * 60
]
}
}
}
}},
{ "$sort": { "_id": 1 } }
])
Basically returns every day within the start and end intervals as (truncating for brevity):
{
"_id" : {
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"day" : ISODate("2018-04-22T00:00:00Z")
},
"hours" : 15
}
{
"_id" : {
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"day" : ISODate("2018-04-23T00:00:00Z")
},
"hours" : 24
}
.... each day in between ...
{
"_id" : {
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"day" : ISODate("2018-05-21T00:00:00Z")
},
"hours" : 24
}
{
"_id" : {
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"day" : ISODate("2018-05-22T00:00:00Z")
},
"hours" : 14
}
Which is the allocated 24 hours on "whole days" and the partial hours on others. As from your example the first day has data that generates as:
{
"_id" : ObjectId("5adc864eaaf408a2b6e325f7"),
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"end" : ISODate("2018-04-22T12:06:46.623Z"),
"start" : ISODate("2018-04-22T11:06:46.623Z"),
"date" : ISODate("2018-04-22T11:06:46.623Z"),
"dayending" : ISODate("2018-04-22T00:00:00Z"),
"hours" : 1
}
{
"_id" : ObjectId("5adc864eaaf408a2b6e325c8"),
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"end" : ISODate("2018-04-22T10:06:46.623Z"),
"start" : ISODate("2018-04-22T08:06:46.623Z"),
"date" : ISODate("2018-04-22T11:06:46.623Z"),
"dayending" : ISODate("2018-04-22T00:00:00Z"),
"hours" : 2
}
{
"_id" : ObjectId("5adc864eaaf408a2b6e325f6"),
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"end" : ISODate("2018-05-22T12:06:46.623Z"),
"start" : ISODate("2018-04-22T11:06:46.623Z"),
"date" : ISODate("2018-05-22T11:06:46.623Z"),
"dayending" : ISODate("2018-04-22T00:00:00Z"),
"hours" : 12
}
Being the two sole entries and one with a 12 hour remainder which makes up the 15 hours and the last day:
{
"_id" : ObjectId("5adc864eaaf408a2b6e325f6"),
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"end" : ISODate("2018-05-22T12:06:46.623Z"),
"start" : ISODate("2018-04-22T11:06:46.623Z"),
"date" : ISODate("2018-05-22T11:06:46.623Z"),
"dayending" : ISODate("2018-05-22T00:00:00Z"),
"hours" : 12
}
{
"_id" : ObjectId("5adc864eaaf408a2b6e325c4"),
"employee" : ObjectId("5adc864eaa3c92b3c4c252c1"),
"end" : ISODate("2018-05-22T10:06:46.623Z"),
"start" : ISODate("2018-05-22T08:06:46.623Z"),
"date" : ISODate("2018-05-22T11:06:46.623Z"),
"dayending" : ISODate("2018-05-22T00:00:00Z"),
"hours" : 2
}
Has a 2 hour entry and another 12 hour remainder making 14 in total.
Explanation
Date Conversions and Math
To explain that there are essentially two main things that need to be done outside of the obvious "date conversion". That by the way can be done with $toDate
from MongoDB 4.0 or via $dateFromString
if you have MongoDB 3.6. Noting that in the latter case you also need to apply different methods for the "date math"
There are detailed examples of handling "date math" in earlier MongoDB versions on Group result by 15 minutes time interval in MongoDb in the cases where you have an earlier version and either $dateFromString
or straight conversion of your data is required first.
Projecting the Dates in the Range
The next main part to make this work is you basically need to construct an array of the dates which a document applies to inside the source document. This is what the $range
expression does, by taking a starting value ( 0
in this case ) and an ending value which here we apply to being the "number of days between" the start
and end
date values.
That difference is returned from a $subtract
in milliseconds, so a $divide
is used over the constant milliseconds in a day to get a whole integer. Using $ceil
here to round, but this could just as easily be $mod
with $subtract
where that operator is not available in earlier versions.
At that point the $range
has really just produced an array of integer values, so the $map
is applied to that array in order to convert these to the actual BSON Date objects that should represent the "day" that data applies for. Again it's just some "date math" applying an addition of the array index value ( +1 of course ) to the original rounded starting date.
Calculating the hours
Now with an array of dates from the earlier stage and some other reformatting of document values into usable BSON Dates, you need to actually compare this "array" content with each start
and end
value to determine how many hours applied in that day.
The first basic case and why we actually created an array for this is to use $unwind
, which effectively copies the result document for every day occurring within the interval. It's a small but important step that must happen before you $group
and actually count things. Bottom line is the $group
will actually use those values as part of it's "primary key" for output, and also comparison to other date information.
Of course the real work here is all being done in the $switch
statement, which again can just as well be a "nested" use of $cond
in earlier versions. Here you basically want to analyse the three possible cases and of course the default fallback for the "full day".
The cases basically are:
Where the current "grouping day" is less than the start
AND the "following day" would be greater than the end
date, simply subtract the difference.
If not the above, then when the end
date is less than the "following day" for grouping then subtract that "grouping day" from the current end
date to get hours of that start of day until the end
time.
If not the above, then when the "grouping day" is less than the start
( without the other end
condition from earlier ) then the hours worked would be from the "following day" subtracting the start
for the difference from start
to the end of the current day.
If those were not true then default to showing a "whole day", which is being shown as 24 hours in this example.
If you had other working hours to apply, then it's just a matter of adjusting that in, i.e "start of day" +8 hours for an 8 a.m start. The same thing basically applies to "end of day" by adding something like +17 for a 5.pm finish. But the basic principles of the logic to implement are still the same as shown above.
NOTE: The main constraint here is $range
, which I believe was from MongoDB 3.0 or possibly 3.2. At any rate you probably really should not be running any version of MongoDB prior to 3.4 at this present time.
If you did have an earlier version, then there are some more details on Group and count over a start and end range with another earlier answer of mine showing a similar process using multiple queries and even mapReduce()
in addition to a very similar $range
example.