1

I have a collection called shops. Structure is like:

[
     {
          '_id' : id1,
          'details' : {name: 'shopA'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p2',
               details:  {
                    'name': 'product2'
               }
          }, {
               _id: 'p4',
               details:  {
                    'name': 'product4'
               }
          }
     },{
          '_id' : id2,
          'details' : {name: 'shopB'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p4',
               details:  {
                    'name': 'product4'
               }
          }, {
               _id: 'p5',
               details:  {
                    'name': 'product5'
               }
          }
     },{
          '_id' : id3,
          'details' : {name: 'shopC'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p2',
               details:  {
                    'name': 'product2'
               }
          }, {
               _id: 'p3',
               details:  {
                    'name': 'product3'
               }
          }
     },{
          '_id' : id4,
          'details' : {name: 'shopOther'},
          'products' : [{
               _id: 'p10',
               details:  {
                    'name': 'product10'
               }
          },{
               _id: 'p12',
               details:  {
                    'name': 'product12'
               }
          }, {
               _id: 'p13',
               details:  {
                    'name': 'product13'
               }
          }
     }
]

Now user can select some of the products from menu and try to find the shops for those. The result should be all the shops which provide atleast one of the selected items.

Example,

Suppose users select ['p1', 'p2', 'p3'] //ids Then only three shops id1, id2, id3 will be listed(id4 has none of these items), plus the structure is such that it removes rest of the products of a shop(which were not listed) from the document in the results array.

Is there a way, I can get such result from mongodb directly?

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
Kop4lyf
  • 4,520
  • 1
  • 25
  • 31
  • See the answers listing `.aggregate()` and also possibly `$redact` further down the responses to [this question](http://stackoverflow.com/questions/3985214/retrieve-only-the-queried-element-in-an-object-array-in-mongodb-collection) – Neil Lunn Mar 19 '15 at 07:52
  • ok, I will check that – Kop4lyf Mar 19 '15 at 07:56
  • $redact should work for stripping down the result, but how do I search only those documents which have atleast one of the elements in the query list? – Kop4lyf Mar 19 '15 at 08:06
  • 1
    See the documentation for [$match](http://docs.mongodb.org/manual/reference/operator/aggregation/match/) for general queries and the [$in](http://docs.mongodb.org/manual/reference/operator/query/in/) operator for the query selection as well as [dot notation](http://docs.mongodb.org/manual/core/document/#dot-notation). Specifically `{ "$match": { "products._id": { "$in": ["p1","p2","p3"] } }}`. Further reading on the aggregation pipeline is also in the core documentation. As well as many samples here [tag:aggregation-framework] – Neil Lunn Mar 19 '15 at 08:12
  • I think that does my job. Thanks. – Kop4lyf Mar 19 '15 at 08:14

1 Answers1

2

Since you did ask nicely and also so well formed then there is some consideration that similar answers may not actually suit for reference, especially if your experience level with the MongoDB product is low.

Options like $redact may seem simple, and they are often well suited. But this is not a case for how you would need to construct the statement:

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$redact": {
    "$cond": {
      "if": {
        "$or": [
          { "$eq": [ "$_id", "p1" ] },
          { "$eq": [ "$_id", "p2" ] },
          { "$eq": [ "$_id", "p3" ] }
        ]
      },
      "then": "$$DESCEND",
      "else": "$$PRUNE"
    }
  }}
])

That works with the "not so obvious" use of $or in an aggregation operator. At least in terms of the correct syntax and form, but it is actually a "complete fail". The reasoning is that because $redact is generally a "recursive" operation, and it inspects at "all levels" of the document and not just at a specific level. So of your in the "top level" the _id assertion will fail as that top level field of the same name is not going to match that condition.

There really isn't anything else you can really do about this, but considering that _id in the array is actually a "unique" element then you can always perform this operation in an $project stage with the help of $map and $setDifference:

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$project": {
    "details": 1,
    "products": {
      "$setDifference": [
        { "$map": {
          "input": "$products",
          "as": "el",
          "in": {
            "$cond": {
              "if": { 
                "$or": [
                  { "$eq": [ "$$el._id", "p1" ] },
                  { "$eq": [ "$$el._id", "p2" ] },
                  { "$eq": [ "$$el._id", "p3" ] }
                ]
              },
              "then": "$$el",
              "else": false
            }
          }
        }},
        [false]
      ]
    }
  }}
])

It seems lengthy, but it actually very efficient. The $map operator processes arrays "inline" for each document and acting on each element to produce a new array. The false assertion made under $cond where the condtions are not a match is balanced by considering the "set" of results in comparison to $setDifference, which effectively "filters" the false results from the resulting array, leaving only the valid matches behind.

Of course where the _id values or entire objects were not truly "unique" then a "set" would no longer be valid. With this consideration, as well as the truth that the mentioned operators are not available to versions of MongoDB prior to 2.6, then the more tradtional approach is to $unwind the array members and then "filter" them via a $match operation.

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$unwind": "$products" },
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$group": {    
      "_id": "$_id",
      "details": { "$first": "$details" },
      "products": { "$push": "$products" }
  }}
])

Consideration is given that as per the other examples, the $match phase should be executed first in the pipeline in order to reduce the "possible" documents matching the condition. The "second" phase with $match does the actuall "filtering" of the document elements inside the array when in the "de-normalized" form.

Since the array was "deconstructed" by $unwind, the purpose of $group is to "re-build" the array, "filtered" from the elements that do not match the condition.

MongoDB also offers the positional $ operator in order to select matched array elements from a query condition. Like so:

db.collection.find(
    { "products._id": { "$in": ["p1","p2","p3"] },
    { "details": 1, "products.$": 1 }
)

But the problem here is that this operator only supports the "first" match on the conditions supplied in the query document. This is a design intent, and as yet there is no strict operator syntax to cater for more than a single match.

So your ultimate approach is currently to use the .aggregate() method in order to actually achieve the match filtering on inner arrays that you desire. Either that or filter the contents responded yourself in client code, depending on how palatable that ultimately is to you.

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317