8

It is basically what the title says.

Input: myArray = an array of words

I have a model with a field wordsCollection, which is an array field.

How can I find all documents of that model whose wordsCollection contains at least n elements of myArray?

Neil Lunn
Mr Cold
  • Show the db structure and what you have tried so far to make it work. – Shrabanee Jun 09 '16 at 12:25
  • I dont think the question is clear enough that I don't need to provide the db structure. I am not sure whether mongodb provides such an API call, so I am just thinking about iterating through all the documents.... Of course, it sounds really bad – Mr Cold Jun 09 '16 at 12:51
  • Are the items of `myArray` and `wordsCollection` unique? – Redu Jun 09 '16 at 13:33

2 Answers

10

Let's say we have the following documents in our collection:

{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d03"), "a" : [ 0, 8, 432, 9, 34, -3 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }

and the following input array, with n = 2:

var inputArray = [1, 3, 0];

We can return those documents where the array field contains at least n elements of a given array using the aggregation framework.

The $match stage selects only those documents whose array has at least n elements: with n = 2, checking that index 1 exists ("a.1") is equivalent to requiring a length of at least 2. This reduces the amount of data to be processed further down the pipeline.
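For an arbitrary n, the same length check can be built programmatically. This is only a minimal sketch: buildMinLengthMatch is a hypothetical helper name, not part of MongoDB, and it assumes the field really holds an array (the dotted path "a.1" would also match an embedded document with a key named "1").

```javascript
// Hypothetical helper (not a MongoDB API): builds a $match stage that keeps
// only documents whose array field has at least n elements.
function buildMinLengthMatch(field, n) {
    if (!Number.isInteger(n) || n < 1) {
        throw new RangeError("n must be a positive integer");
    }
    // Index n - 1 exists only when the array holds at least n elements.
    return { "$match": { [field + "." + (n - 1)]: { "$exists": true } } };
}

console.log(JSON.stringify(buildMinLengthMatch("a", 2)));
// {"$match":{"a.1":{"$exists":true}}}
```

The returned object can be used as the first stage of the pipeline in place of the hard-coded "a.1" check.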

The $redact pipeline stage evaluates a logical condition with the $cond operator and returns the system variable $$KEEP to keep the document when the condition is true, or $$PRUNE to discard it when the condition is false.

In our case, the condition is a $gte expression, which returns true if the $size of the intersection of the two arrays, computed with the $setIntersection operator, is greater than or equal to 2. Because $setIntersection treats its arguments as sets, duplicate values are counted only once.

db.collection.aggregate(
    [ 
        { "$match": { "a.1": { "$exists": true } } }, 
        { "$redact": { 
            "$cond": [ 
                { "$gte": [ 
                    { "$size": { "$setIntersection": [ "$a", inputArray ] } }, 
                    2
                ]},
                "$$KEEP", 
                "$$PRUNE" 
            ]
        }}
    ]
)

which produces:

{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }
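To sanity-check the result, the same condition can be mimicked in plain JavaScript on the sample documents. This is just an in-memory sketch of the logic, not something to run against a real collection: sharedCount is a hypothetical helper, and the _id values are written as plain strings instead of ObjectId.

```javascript
// Sample documents from above, with _id written as plain strings.
const docs = [
    { _id: "5759658e654456bf4a014d01", a: [1, 3, 9, 2, 9, 0] },
    { _id: "5759658e654456bf4a014d02", a: [0, 8, 1] },
    { _id: "5759658e654456bf4a014d03", a: [0, 8, 432, 9, 34, -3] },
    { _id: "5759658e654456bf4a014d04", a: [0, 0, 4, 3, 2, 7] }
];
const inputArray = [1, 3, 0];
const n = 2;

// Hypothetical helper: counts the distinct values present in both arrays,
// mirroring { $size: { $setIntersection: [arr, input] } }.
function sharedCount(arr, input) {
    const inputSet = new Set(input);
    return new Set(arr.filter((x) => inputSet.has(x))).size;
}

// Keep the documents that share at least n distinct values with inputArray.
const kept = docs.filter((doc) => sharedCount(doc.a, inputArray) >= n);
console.log(kept.map((doc) => doc._id));
// logs the ids ending in d01, d02 and d04
```

The third document is dropped because it shares only the value 0 with inputArray, and the repeated 0 in the fourth document counts once.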
styvane
0

Use aggregation.

In the $match stage of the aggregation pipeline, you can use $size and $gte.

libik
  • I'm gonna try; that could be pretty useful, if it works, to avoid having to keep updating a "count" field! Dunno why you got downvoted tho, I upvoted back – Julien Leray Jun 09 '16 at 12:48
  • @libik: I'll give it a try – Mr Cold Jun 09 '16 at 12:53
  • You can't use `$size` in `$match` here. I fail to see how this solves the problem at hand. Btw @JulienLeray, you don't upvote an answer just because it has been downvoted. Vote based on the content's quality. – styvane Jun 09 '16 at 13:37