I have the following documents in my collection. Each document contains historical weather data about a specific location:
{
'location':'new york',
'history':[
{'timestamp':1524542400, 'temp':79, 'wind_speed':1, 'wind_direction':'SW'}
{'timestamp':1524548400, 'temp':80, 'wind_speed':2, 'wind_direction':'SW'}
{'timestamp':1524554400, 'temp':82, 'wind_speed':3, 'wind_direction':'S'}
{'timestamp':1524560400, 'temp':78, 'wind_speed':4, 'wind_direction':'S'}
]
},
{
'location':'san francisco',
'history':[
{'timestamp':1524542400, 'temp':80, 'wind_speed':5, 'wind_direction':'SW'}
{'timestamp':1524548400, 'temp':81, 'wind_speed':6, 'wind_direction':'SW'}
{'timestamp':1524554400, 'temp':82, 'wind_speed':7, 'wind_direction':'S'}
{'timestamp':1524560400, 'temp':73, 'wind_speed':8, 'wind_direction':'S'}
]
},
{
'location':'miami',
'history':[
{'timestamp':1524542400, 'temp':84, 'wind_speed':9, 'wind_direction':'SW'}
{'timestamp':1524548400, 'temp':85, 'wind_speed':10, 'wind_direction':'SW'}
{'timestamp':1524554400, 'temp':86, 'wind_speed':11, 'wind_direction':'S'}
{'timestamp':1524560400, 'temp':87, 'wind_speed':12, 'wind_direction':'S'}
]
}
I would like to get a list of the most recent weather data for each location (more or less) like so:
{
'location':'new york',
'history':{'timestamp':1524560400, 'temp':78, 'wind_speed':4, 'wind_direction':'S'}
},
{
'location':'san francisco',
'history':{'timestamp':1524560400, 'temp':73, 'wind_speed':8, 'wind_direction':'S'}
},
{
'location':'miami',
'history':{'timestamp':1524560400, 'temp':87, 'wind_speed':12, 'wind_direction':'S'}
}
I was pretty sure it needed some sort of $group aggregate but can't figure out how to select an entire object by $max:<field>
. For example the below query only returns the max timestamp itself, without any of the accompanying fields.
db.collection.aggregate([{
'$unwind': '$history'
}, {
'$group': {
'_id': '$name',
'timestamp': {
'$max': '$history.timestamp'
}
}
}])
returns
{ "_id" : "new york", "timestamp" : 1524560400 }
{ "_id" : "san franciscoeo", "timestamp" : 1524560400 }
{ "_id" : "miami", "timestamp" : 1524560400 }
The actual collection and arrays are very large so client side processing won't be ideal. Any help would be much appreciated.