2

My Mongo database has documents as so:

{
   "timestamp": ISODate("2015-09-27T15:28:06.0Z"),
    "value": '123'
},
{
   "timestamp": ISODate("2015-09-27T15:31:06.0Z"),
    "value": '737'
},
{
   "timestamp": ISODate("2015-09-27T15:35:00.0Z"),
    "value": '456'
},
{
   "timestamp": ISODate("2015-09-27T15:40:20.0Z"),
    "value": '789'
}

...etc...

What I want to do is aggregate these in 5 minute intervals and than get the most recent (with the latest timestamp) value per 'group of 5 minutes'.

So basically the steps are:

1) split into groups of 5 minutes

2) return the 5-minute timestamp and the value of the document that has the newest timestamp within this 5 minute group

Based on that and my documents above the documents returned should be:

{
    "timestamp": ISODate("2015-09-27T15:25:00.0Z"),
    "value": '123'
},
{
    "timestamp": ISODate("2015-09-27T15:35:00.0Z"),
    "value": '456' // 456 has a newer timestamp than 737, which are in the same 5 minute range
},
{
    "timestamp": ISODate("2015-09-27T15:40:00.0Z"),
    "value": '789'
}

I have tried grouping into 5 minute intervals as described here: https://stackoverflow.com/a/26814496/1007236

Starting from there I can't find out how to return the value of the most recent within each 5 minute group.

How can I do that?

Community
  • 1
  • 1
Jelmer Keij
  • 1,576
  • 1
  • 19
  • 33

1 Answers1

2

You solve this by a very simple application of Date math:

db.collection.aggregate([
    { "$sort": { "timestamp": 1 } },
    { "$group": {
        "_id": {
            "$add": [
                { "$subtract": [
                    { "$subtract": [ "$timestamp", new Date(0) ] },
                    { "$mod": [
                        { "$subtract": [ "$timestamp", new Date(0) ] },
                        1000 * 60 * 5
                    ]}
                ]},
                new Date(0)
            ]
        },
        "value": { "$first": "$value" }
    }}
])

Where the basic principle is finding the modulo ( $mod ) or "remainder" from the time by a five minute interval and subtracting that from the base time. This rounds to "five minutes".

Of course the other part is you $sort in order to make sure the smallest original "timestamp" sorted "value" is on "top".

The other parts are when you $subtract "epoch" date as a BSON Date from another date then you receive an "integer" in result. The similar part is adding ( $add ) an "integer" to a BSON Date type to receive another BSON Date.

The result is BSON Date objects rounded out to the interval you use with the math.

1000 millisecons X 60 seconds X 5 minutes.

Blakes Seven
  • 49,422
  • 14
  • 129
  • 135
  • Excellent. Thanks again. Is it also possible to return some extra columns in these type of aggregations? For example, say you'd want to return a column 'myColumn' for every document with some custom value? – Jelmer Keij Sep 29 '15 at 15:20
  • 1
    @JelmerKeij If you have more questions them please [ask them](http://stackoverflow.com/questions/ask) as it is the clearest way to express a new question anyway. – Blakes Seven Sep 29 '15 at 23:12