
My structure is as follows:

{
    day: x,
    events:
    [
        {
            year: y,
            info: z
        }
    ]
}

So far I have created the following query, which does not return an error but does not show anything either (which is wrong).

db.days.aggregate([
{ 
    $match: 
    {
        $and:
        [ 
            {
                'day': 'March_13' 
            },
            {
                'events.year': '1870'
            },
            {
                'events.info': {$regex: "./French./"}
            }
        ]
    }
},
{
    $unwind: {path: "$events"},
},
{ 
    $match: 
    {
        'info': { $regex: '.*French.*'}
    }
}])

From what I read I need to group by _id, but I do not know how to recreate the array with the objects that satisfied the second $match.

Could you please have a look and maybe tell me why the initial query does not work and advise me on a group block?

Some sample data is here:

{ 
    "day" : "March_13", 
    "events" : 
         [
             { 
                 "year" : "1929", 
                 "info" : "Peter Breck, American actor (d. 2012)" 
             }, 
             { 
                 "year" : "1929", 
                 "info" : "Joseph Mascolo, American actor" 
             }, 
             { 
                 "year" : "1929", 
                 "info" : "Zbigniew Messner, Polish economist and politician, 9th Prime Minister of the Republic of Poland (d. 2014)" 
             }, 
             { 
                 "year" : "1929", 
                 "info" : "Bunny Yeager, American model and photographer (d. 2014)" 
             }
         ] 
}

And here is the result I would expect if I succeeded in querying by the word "American":

{ 
    "day" : "March_13", 
    "events" : 
         [
             { 
                 "year" : "1929", 
                 "info" : "Peter Breck, American actor (d. 2012)" 
             }, 
             { 
                 "year" : "1929", 
                 "info" : "Joseph Mascolo, American actor" 
             }, 
             { 
                 "year" : "1929", 
                 "info" : "Bunny Yeager, American model and photographer (d. 2014)" 
             }
         ] 
}

Basically I want to check if the field info contains the searched word, and if it does I keep it in the array.

styvane

2 Answers


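After $unwind, each array element still lives under the events key, so your second $match has to filter on 'events.info' rather than 'info'; as written it matches nothing. Try running the following aggregation pipeline on the above example to get the desired result: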

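db.days.aggregate([
    // Select only documents that contain at least one candidate event.
    {
        "$match": {
            "day" : "March_13",
            "events.year": "1929",
            "events.info": /American/
        }
    },
    // Produce one document per array element; the element stays under "events".
    { "$unwind": "$events" },
    // Apply the same conditions again, this time per element.
    {
        "$match": {
            "day" : "March_13",
            "events.year": "1929",
            "events.info": /American/
        }
    },
    // Rebuild the array from the elements that survived the second $match.
    {
        "$group": {
            "_id": "$_id",
            "day": { "$first": "$day" },
            "events": { "$push": "$events" }
        }
    }
])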

Sample Output

/* 0 */
{
    "result" : [ 
        {
            "_id" : ObjectId("5706b38dcc578484faab815f"),
            "day" : "March_13",
            "events" : [ 
                {
                    "year" : "1929",
                    "info" : "Peter Breck, American actor (d. 2012)"
                }, 
                {
                    "year" : "1929",
                    "info" : "Joseph Mascolo, American actor"
                }, 
                {
                    "year" : "1929",
                    "info" : "Bunny Yeager, American model and photographer (d. 2014)"
                }
            ]
        }
    ],
    "ok" : 1
}
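
To see why the second $match has to use events.info rather than info, here is a rough plain-JavaScript imitation of the three stages. It is only a sketch and does not talk to MongoDB: the _id value is made up, the sample document is cut down to two events, and the second info string is shortened.

```javascript
// Cut-down version of the sample document from the question.
const doc = {
    _id: 1, // hypothetical id, for illustration only
    day: "March_13",
    events: [
        { year: "1929", info: "Peter Breck, American actor (d. 2012)" },
        { year: "1929", info: "Zbigniew Messner, Polish economist and politician" }
    ]
};

// Like $unwind: one document per array element, element kept under "events".
const unwound = doc.events.map(e => ({ _id: doc._id, day: doc.day, events: e }));

// Filtering on a top-level "info" field finds nothing, because it does not exist.
const wrongMatch = unwound.filter(d => /American/.test(d.info));

// Filtering on the nested "events.info" path keeps the matching elements.
const rightMatch = unwound.filter(d => /American/.test(d.events.info));

// Like $group with $first and $push: rebuild the array from the survivors.
const regrouped = {
    _id: doc._id,
    day: rightMatch[0].day,
    events: rightMatch.map(d => d.events)
};

console.log(wrongMatch.length);       // 0
console.log(regrouped.events.length); // 1
```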
chridam

This would be simple if we could use $regex with the $cond operator or the $filter operator. That being said, you have two options. The first is to use the aggregation framework (as mentioned in this answer) with the native aggregation pipeline operators, which will be faster because they are implemented in C++. However, the pipeline needs the $unwind operator, and if you are dealing with a large array, the size of your documents after de-normalization may exceed the 16MB limit, in which case the aggregation query will fail. If that happens, you can do this using mapReduce:

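// Emit, per document, only the events that satisfy both conditions.
function map() { 
    var events = this.events.filter(function(element) { 
        return (/American/i).test(element.info) && element.year === "1929";
    });
    emit(this.day, events);
}

db.collection.mapReduce(
    map, 
    // Not invoked when each key is emitted only once, as in the output below.
    function(key, value) {}, 
    // "query" goes in the same options object as "out".
    { 
        out: { inline: 1 },
        query: { "day": "March_13" }
    }
)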

Which returns:

{
    "results" : [
            {
                "_id" : "March_13",
                "value" : [
                    {
                        "year" : "1929",
                        "info" : "Peter Breck, American actor (d. 2012)"
                    },
                    {
                        "year" : "1929",
                        "info" : "Joseph Mascolo, American actor"
                    },
                    {
                        "year" : "1929",
                        "info" : "Bunny Yeager, American model and photographer (d. 2014)"
                    }
                ]
            }
    ],
    "timeMillis" : 27,
    "counts" : {
        "input" : 1,
        "emit" : 1,
        "reduce" : 0,
        "output" : 1
    },
    "ok" : 1
}
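
If you are on MongoDB 4.2 or later, the limitation mentioned at the top of this answer should no longer apply, because the $regexMatch expression can be used inside $filter. The pipeline below is a sketch under that version assumption. The variable name pipeline and the use of $elemMatch in the first stage are my own choices, and the collection name days is taken from the question.

```javascript
// Assumes MongoDB 4.2+, where the $regexMatch aggregation expression exists.
const pipeline = [
    // Keep only documents where a single element satisfies both conditions.
    {
        $match: {
            day: "March_13",
            events: { $elemMatch: { year: "1929", info: /American/ } }
        }
    },
    // Trim the array in place; no $unwind or $group needed.
    {
        $project: {
            day: 1,
            events: {
                $filter: {
                    input: "$events",
                    as: "event",
                    cond: {
                        $and: [
                            { $eq: ["$$event.year", "1929"] },
                            { $regexMatch: { input: "$$event.info", regex: "American" } }
                        ]
                    }
                }
            }
        }
    }
];

// In the mongo shell this would be run as: db.days.aggregate(pipeline)
```

Since the array is filtered inside each document, the intermediate documents never grow beyond the originals.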
styvane