1

I've tried many answers to similar problems using $lookup, $unwind, and $match, but I can't get this to work for my sub-sub-subdocument situation.

I have this collection, Things:

{
    "_id" : ObjectId("5a7241f7912cfc256468cb27"),
    "name" : "Fortress of Solitude",
    "alias" : "fortress_of_solitude",
},
{
    "_id" : ObjectId("5a7247ec548c9ad042f579e2"),
    "name" : "Batcave",
    "alias" : "batcave",
},
{
    "_id" : ObjectId("6a7247bc548c9ad042f579e8"),
    "name" : "Oz",
    "alias" : "oz",
},

and this one-document collection, Venues:

{
    "_id" : ObjectId("5b9acabbbf71f39223f8de6e"),
    "name" : "The Office",
    "floors" : [ 
        {
            "name" : "1st Floor",
            "places" : [ 
                {
                    "name" : "Front Entrance",
                    "alias" : "front_entrance"
                }
            ] 
        }, 
        {
            "name" : "2nd Floor",
            "places" : [ 
                {
                    "name" : "Batcave",
                    "alias" : "batcave"
                },
                {
                    "name" : "Oz",
                    "alias" : "oz"
                }
           ]
        }
    ]
}

I want to return all the Things, but with the Venue's floors.places.name aggregated with each Thing if it exists if the aliases match between Things and Venues. So, I want to return:

{
    "_id" : ObjectId("5a7241f7912cfc256468cb27"),
    "name" : "Fortress of Solitude",
    "alias" : "fortress_of_solitude",
                                 <-- nothing added here because
                                 <-- it's not found in Venues
},
{
    "_id" : ObjectId("5a7247ec548c9ad042f579e2"),
    "name" : "Batcave",
    "alias" : "batcave",
    "floors" : [                        <-- this should be 
        {                               <-- returned 
            "places" : [                <-- because 
                {                       <-- the alias
                    name" : "Batcave"   <-- matches
                }                       <-- in Venues
            ]                           <-- 
        }                               <-- 
    ]                                   <--     
},
{
    "_id" : ObjectId("6a7247bc548c9ad042f579e8"),
    "name" : "Oz",
    "alias" : "oz",
    "floors" : [                        <-- this should be 
        {                               <-- returned 
            "places" : [                <-- because 
                {                       <-- the alias
                    name" : "Oz"        <-- matches
                }                       <-- in Venues
            ]                           <-- 
        }                               <-- 
    ]                                   <--     
}

I've gotten as far as the following query, but it only returns the entire Venues.floors array as an aggregate onto each Thing, which is way too much extraneous data aggregated. I just want to merge each relevant floor.place sub-subsubdocument from Venues into its corresponding Thing if it exists in Venues.

db.getCollection('things').aggregate([
  {$lookup: {from: "venues",localField: "alias",foreignField: "floors.places.alias",as: "matches"}},
  {
    $replaceRoot: { newRoot: { $mergeObjects: [ { $arrayElemAt: [ "$matches", 0 ] }, "$$ROOT" ] } }
  },
  { $project: { matches: 0 } }  
])

I'm struggling with existing answers, which seem to change at MongoDB version 3.2, 3.4, 3.6, or 4.2 to include or not include $unwind, $pipeline, and other terms. Can someone explain how to get a sub-sub-subdocument aggregated like this? Thanks!

RealHandy
  • 534
  • 3
  • 8
  • 27

2 Answers2

1

Since MongoDB v3.6, we may perform uncorrelated sub-queries which gives us more flexibility to join two collections.

Try this:

db.things.aggregate([
  {
    $lookup: {
      from: "venues",
      let: {
        "alias": "$alias"
      },
      pipeline: [
        {
          $unwind: "$floors"
        },
        {
          $project: {
            _id: 0,
            places: {
              $filter: {
                input: "$floors.places",
                cond: {
                  $eq: [
                    "$$alias",
                    "$$this.alias"
                  ]
                }
              }
            }
          }
        },
        {
          $match: {
            "places.0": {
              $exists: true
            }
          }
        },
        {
          $unset: "places.name"
        }
      ],
      as: "floors"
    }
  }
])

MongoPlayground

Valijon
  • 12,667
  • 4
  • 34
  • 67
  • Thanks! Your answer made floors the new root, and mostly works, but it returns all of the places in each result (it returns both fortress and batcave in the floors.places of batcave if both of them are defined in venues), so I'm going with the answer that uses addFields and group to get rid of the extraneous places. Your solution shows other userul bits, though. (I updated the example data to have fortress also listed in venues so you can see what I mean). – RealHandy Feb 14 '20 at 19:50
  • Added Oz, actually, to show the extraneous places output. – RealHandy Feb 14 '20 at 20:07
  • @RealHandy try again please, I've updated pipeline conditions – Valijon Feb 14 '20 at 20:28
  • Looks like your update makes places[] disappear completely, so that it's only floors[ { name: Batcave }] rather than floors[ { places: [ { name: Batcave } ] } ] – RealHandy Feb 14 '20 at 21:02
  • @RealHandy check again :) – Valijon Feb 14 '20 at 21:34
1

You can try this :

db.things.aggregate([
    {
        $lookup:
        {
            from: "venues",
            let: { alias: "$alias" },
            pipeline: [
                { $unwind: { path: "$floors", preserveNullAndEmptyArrays: true } },
                { $match: { $expr: { $in: ['$$alias', '$floors.places.alias'] } } },
                /**  Below stages are only if you've docs like doc 2 in Venues */
                { $addFields: { 'floors.places': { $filter: { input: '$floors.places', cond: { $eq: ['$$this.alias', '$$alias'] } } } } },
                { $group: { _id: '$_id', name: { $first: '$name' }, floors: { $push: '$floors' } } },
                {$project : {'floors.places.alias': 1, _id :0}} // Optional
            ],
            as: "matches"
        }
    }
])

Test : MongoDB-Playground

whoami - fakeFaceTrueSoul
  • 17,086
  • 6
  • 32
  • 46
  • 1
    I like the use of replaceRoot in the @Valijon answer to make floors the root of the added-on data rather than matches. My couple of tries at adding that haven't worked (I'm def a noob at these mongodb syntaxes). Other than that, though, your answer does deliver exactly what I was hoping for, which is to have all the extraneous floors and places data removed. Thanks! – RealHandy Feb 14 '20 at 19:55
  • @RealHandy : I'm a bit confused when you said *floors the root of the added-on data rather than matches* (You certainly do bring `floors` to top level but what if you've two floors ?) Please raise another question with all current issues & tag it here we can certainly help you on that.. – whoami - fakeFaceTrueSoul Feb 14 '20 at 20:14
  • let me try to clarify ( as do many people, i wish you could format comments). the other solution has floors[], still an array, as the subdocument added on to each Thing in the output. So if, say, Elevator 1 was a Thing and was present in both floors of the venue, then Elevator 1 would have floors, an array of two elements, added on. In your solution, matches[0] is the root of the added-on data. There's only one venue, so there will always only be one element in matches[]. But if I added a 2nd venue that also has an Elevator 1 in it, then I'd need your matches[] to show both venues. – RealHandy Feb 14 '20 at 20:26
  • I was just liking having matches[] removed from the root of the added-on output because it's a given that there's only one match (venue) in my scenario. – RealHandy Feb 14 '20 at 20:28
  • @RealHandy : Do you want entire venues object on floor field on just `"name": "Batcave"` ? – whoami - fakeFaceTrueSoul Feb 14 '20 at 20:43
  • 1
    I’m not sure whether I’ll end up needing more fields than name, but it’s clear to me how to make those edits to your solution. – RealHandy Feb 15 '20 at 15:50