I am building a website to explore MongoDB data. In my database I store GPS measurements captured from smartphones, and I use various queries to explore them. One query groups the measurements by day and counts them. Another counts the number of measurements for each kind of smartphone (iOS, Android, ...), and so on.

All these queries share the same $match parameters in their aggregation pipeline. In this stage I filter the measurements to focus on a time interval and a geographical area.

Is there a way to keep the subset produced by the $match in a cache, so that the database does not need to apply this filter every time?

I want to optimize the response time of my queries.

Sample of one of the queries:

cursor = db.myCollection.aggregate(
   [
    {
        "$match":
        {
            "$and": [{"t": {"$gt": tMin, "$lt": tMax}, "location":{"$geoWithin":{"$geometry":square}}}]
        }
    },
    {
        "$group":
        {
           "_id": {"hourGroup": "$tHour"},
           "count": {"$sum": 1}
        }
    }
   ]
)

I want to keep the result of this stage in the cache:

    "$match":
    {
        "$and": [{"t": {"$gt": tMin, "$lt": tMax}, "location":{"$geoWithin":{"$geometry":square}}}]
    }
SwissFr

1 Answer

One way to do it is to create a new collection using the $out pipeline stage.

Then, as you run the batch of queries, the first query creates the matched output and the following ones can use its results.
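As a rough sketch of the $out idea, the helpers below build the two kinds of pipeline as plain Python lists, in the same dict style as the question. The function names, the collection name "matchedSubset" and the optional $project step (kept here only to limit what gets written to the new collection) are assumptions for illustration, not part of the original question.

```python
def build_materialize_pipeline(match_filter, out_name, keep_fields):
    """Pipeline that applies the shared $match once and writes the
    matching documents to the collection `out_name` via $out."""
    # Assumption: copying only the fields needed later keeps the
    # materialized collection small. _id is carried over by default.
    projection = {field: 1 for field in keep_fields}
    return [
        {"$match": match_filter},
        {"$project": projection},
        {"$out": out_name},
    ]


def build_count_pipeline(label, field):
    """Pipeline that groups on `field` and counts documents.
    It has no $match stage because it is meant to run on the
    already filtered collection."""
    return [
        {"$group": {"_id": {label: "$" + field}, "count": {"$sum": 1}}}
    ]


# Hypothetical usage against a live database (not executed here):
#   db.myCollection.aggregate(
#       build_materialize_pipeline(match_filter, "matchedSubset", ["tHour"]))
#   cursor = db.matchedSubset.aggregate(
#       build_count_pipeline("hourGroup", "tHour"))
```

Note that $out replaces the target collection if it already exists, so a different interval or area would need either a fresh run or a different collection name.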

There is a new pipeline stage in development called $facet, which will let us execute the $match once and then use its result in multiple aggregation paths (the plan is to have it ready in MongoDB 3.4).
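On servers where $facet is available (MongoDB 3.4 and later), the shared filter could be written once and followed by several groupings, roughly as sketched below. The helper name and the "deviceType" field are hypothetical. Only "tHour" appears in the question.

```python
def build_facet_pipeline(match_filter, group_fields):
    """Pipeline with one shared $match followed by a $facet stage
    that runs one counting sub-pipeline per entry in `group_fields`
    (a dict mapping an output label to the field to group on)."""
    facets = {
        label: [{"$group": {"_id": "$" + field, "count": {"$sum": 1}}}]
        for label, field in group_fields.items()
    }
    return [{"$match": match_filter}, {"$facet": facets}]


# Hypothetical usage against a live database (not executed here):
#   pipeline = build_facet_pipeline(
#       match_filter, {"byHour": "tHour", "byDevice": "deviceType"})
#   result = next(db.myCollection.aggregate(pipeline))
#   result["byHour"] and result["byDevice"] then hold the two counts.
```

The $facet stage returns a single document holding all sub-results, so it is subject to the 16 MB document size limit. That is fine for small count tables like these, but not for returning the filtered measurements themselves.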

Any comments welcome!

profesor79
  • My dataset is very large. Creating a new collection might be problematic in terms of storage. $facet might solve it in the future, but I need an alternative at the moment. – SwissFr Jul 01 '16 at 15:36