3

I'm trying to do a simple find on my test collection

Here's an entry sample:

{
    "_id": "movie:1",
    "title": "Vertigo",
    "year": 1958,
    "genre": "drama",
    "summary": "Scottie Ferguson, ancien inspecteur de police, est sujet au vertige depuis qu'il a vu mourir son collègue. Elster, son ami, le charge de surveiller sa femme, Madeleine, ayant des tendances suicidaires. Amoureux de la jeune femme Scottie ne remarque pas le piège qui se trame autour de lui et dont il va être la victime... ",
    "country": "DE",
    "director":     {
        "_id": "artist:3",
        "last_name": "Hitchcock",
        "first_name": "Alfred",
        "birth_date": "1899"    
        },
    "actors": [
        {
        "_id": "artist:15",
        "first_name": "James",
        "last_name": "Stewart",
        "birth_date": "1908",
        "role": "John Ferguson" 
        },
        {
        "_id": "artist:16",
        "first_name": "Kim",
        "last_name": "Novak",
        "birth_date": "1925",
        "role": "Madeleine Elster"  
        },
        {
        "_id": "artist:282",
        "first_name": "Arthur",
        "last_name": "Pierre",
        "birth_date": null,
        "role": null    
        }
    ]
}

I would like to find movies where a director is also an actor. Is it possible to do it with a simple $elemMatch like :

find({actors: {$elemMatch: {"_id": "this.director._id"} })

Thanks!

styvane
  • 59,869
  • 19
  • 150
  • 156
RobinFrcd
  • 4,439
  • 4
  • 25
  • 49

2 Answers2

1

If you just need to compare the ids of director with actors and there will be only one director by several actors, you can use $setIsSubset

db.a.aggregate([
{
    $group : {
        _id : "$_id",
        director_id : {$push : "$director._id"},
        actors_ids : {$first : "$actors._id"}
    }
},
{
    $project : {
        _id : 1,
        directorIsActor : {
            $setIsSubset : ["$director_id", "$actors_ids"]
        }
    }
}
])
ares
  • 4,283
  • 6
  • 32
  • 63
1

From the previously linked dupe (possible), a solution using $where would follow:

db.collection.find({
    "$where": function() {
        self = this;
        return this.actors.filter(function(actor) {
            return self.director._id === actor._id;
        }).length > 0
    }
})

And the other suggested approach which uses the aggregation framework $redact pipeline:

db.collection.aggregate([
    { 
        "$redact": { 
            "$cond": [
                { 
                    "$setIsSubset": [ 
                        ["$director._id"], 
                        {
                            "$map": {
                                "input": "$actors",
                                "as": "el",
                                "in": "$$el._id"
                            }
                        }
                    ] 
                },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    }
])

In the above, the condition logic for $redact is done through the use of set operators $setIsSubset and $map.

The $map operator will return an array with just the actor id's from the actors array after applying an expression to each element in the array. So for example, the expression

{
    "$map": {
        "input": "$actors",
        "as": "el",
        "in": "$$el._id"
    }
}

if applied on the actors array

[ 
    {
        "_id" : "artist:3",
        "first_name" : "James",
        "last_name" : "Stewart",
        "birth_date" : "1908",
        "role" : "John Ferguson"
    }, 
    {
        "_id" : "artist:16",
        "first_name" : "Kim",
        "last_name" : "Novak",
        "birth_date" : "1925",
        "role" : "Madeleine Elster"
    }, 
    {
        "_id" : "artist:282",
        "first_name" : "Arthur",
        "last_name" : "Pierre",
        "birth_date" : null,
        "role" : null
    }
]

will return

[ "artist:3", "artist:16", "artist:282" ]

This result is compared with a single element array ["$directors._id"] using the $setIsSubset operator which takes two arrays and returns true when the first array is a subset of the second, including when the first array equals the second array, and false otherwise.

For example,

{ 
    "$setIsSubset": [ 
        [ "artist:12" ], 
        [ "artist:3", "artist:16", "artist:282" ] 
    ] 
}       // false

{ 
    $setIsSubset: [ 
        [ "artist:282" ], 
        [ "artist:3", "artist:16", "artist:282" ] 
    ] 
}       // true

The boolean result from the operator is then used as the basis for the $redact pipeline.

The explanations for performance still hold: $where is a good hack when necessary, but it should be avoided whenever possible.

Community
  • 1
  • 1
chridam
  • 100,957
  • 23
  • 236
  • 235