I have data stored in MongoDB in the following format.
{
    "_id" : ObjectId("570b487fb5360dd1e5ef840c"),
    "internal_id" : 1,
    "created_at" : ISODate("2015-07-14T10:08:38.994Z"),
    "updated_at" : ISODate("2016-01-10T00:35:19.748Z"),
    "ad_account_id" : 1,
    "updated_time" : "2013-08-05T04:48:49-0700",
    "created_time" : "2013-08-05T04:46:35-0700",
    "name" : "Sale1",
    "daily" : [
        {"clicks": 5000, "date": "2015-04-16"},
        {"clicks": 5100, "date": "2015-04-17"},
        {"clicks": 5030, "date": "2015-04-20"}
    ],
    "custom_tags" : {
        "Event" : {
            "name" : "Clicks"
        },
        "Objective" : {
            "name" : "Sale"
        },
        "Image" : {
            "name" : "43c3fe7b262cde5f476ed303e472c65a"
        },
        "Goal" : {
            "name" : "10"
        },
        "Type" : {
            "name" : "None"
        },
        "Call To Action" : {
            "name" : "None"
        },
        "Landing Pages" : {
            "name" : "www.google.com"
        }
    }
}
I am trying to group individual documents by internal_id to find the aggregate sum of clicks from, say, 2015-04-15 to 2015-04-21 using the aggregate method.
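A pipeline of the kind described here might be sketched as follows. This is only an illustration, not the actual query: it assumes the date values inside daily are ISO-formatted strings (so plain string comparison orders them correctly), and build_clicks_pipeline and total_clicks are hypothetical names chosen for the example.

```python
def build_clicks_pipeline(start_date, end_date):
    """Build an aggregation pipeline that sums daily clicks per internal_id.

    Assumes daily.date holds "YYYY-MM-DD" strings, which compare
    correctly as plain strings.
    """
    date_range = {"$gte": start_date, "$lte": end_date}
    return [
        # Keep only documents with at least one daily entry in the range.
        {"$match": {"daily": {"$elemMatch": {"date": date_range}}}},
        # Emit one document per element of the daily array.
        {"$unwind": "$daily"},
        # Drop the unwound entries that fall outside the range.
        {"$match": {"daily.date": date_range}},
        # Sum the remaining clicks for each internal_id.
        {"$group": {
            "_id": "$internal_id",
            "total_clicks": {"$sum": "$daily.clicks"},
        }},
    ]


pipeline = build_clicks_pipeline("2015-04-15", "2015-04-21")
```

For the sample document, all three daily entries fall inside that range, so the sum for internal_id 1 would be 5000 + 5100 + 5030 = 15130.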
In pymongo, when I run an aggregate that uses $project on just internal_id, I get the results, but when I also try to $project the custom_tags fields, I get the following error:
OperationFailure: Exceeded memory limit for $group, but didn't allow external sort.
Pass allowDiskUse:true to opt in.
Following the answer here, I even changed my aggregate call to list(collection._get_collection().aggregate(mongo_query["pipeline"], allowDiskUse=True)), but it still throws the same error.
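For comparison, the same call against a plain pymongo Collection can be sketched as below. This assumes the object returned by _get_collection() behaves like a pymongo Collection and that the installed pymongo version accepts allowDiskUse as a keyword argument to aggregate(); run_aggregate is a hypothetical helper name.

```python
def run_aggregate(collection, pipeline):
    """Run the pipeline, allowing stages to spill to disk on the server.

    `collection` is assumed to expose pymongo's Collection.aggregate().
    """
    # allowDiskUse is an option of the aggregate command itself,
    # passed as a keyword argument rather than as a pipeline stage.
    cursor = collection.aggregate(pipeline, allowDiskUse=True)
    # Materialize the cursor so the results can be reused.
    return list(cursor)
```

If the error persists with a call of this shape, one thing worth checking is whether the option actually reaches the server, for example by confirming which pymongo version is installed.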