I have a MongoDB collection like this:

db.templates.insertMany( [
   { _id: 1, uuid: "1", name: "t1", related_templates: [ "2", "2" ] },
   { _id: 2, uuid: "2", name: "t2", related_templates: [ "3", "3" ] },
   { _id: 3, uuid: "3", name: "t3", related_templates: [ "4", "4" ] },
   { _id: 4, uuid: "4", name: "t4"},
] )

As you can see, the data represents a tree structure, but supports duplicate references to the same child node. I'm trying to recursively fetch the whole tree starting from t1, including duplicate references.

The result would look like this:

{
    "_id" : 1,
    "uuid" : "1",
    "name": "t1",
    "related_templates" : [
        {
            "_id" : 2,
            "uuid" : "2",
            "name" : "t2",
            "related_templates" : [
                {
                    "_id" : 3,
                    "uuid" : "3",
                    "name" : "t3",
                    "related_templates" : [
                        {
                            "_id" : 4,
                            "uuid" : "4",
                            "name" : "t4"
                        },
                        {
                            "_id" : 4,
                            "uuid" : "4",
                            "name" : "t4"
                        }
                    ]
                },
                {
                    "_id" : 3,
                    "uuid" : "3",
                    "name" : "t3",
                    "related_templates" : [
                        {
                            "_id" : 4,
                            "uuid" : "4",
                            "name" : "t4"
                        },
                        {
                            "_id" : 4,
                            "uuid" : "4",
                            "name" : "t4"
                        }
                    ]
                }
            ]
        },
        ...(t2 repeats here)
    ]
}

The solution suggested in the MongoDB documentation is here: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#std-label-unwind-example. If there are no duplicate references, this solution works great, and with a bit of modification it even allows for recursive lookups. However, in my situation I need to preserve duplicate lookups.

I've also considered the legacy solution of using unwind + group. That solution does preserve duplicates, but I haven't figured out how to use it recursively.

I've also considered using the solution from the MongoDB documentation to fetch without duplicates, then using a map to attach the fetched data to the original related_templates array. I think this would work, but it doesn't seem very elegant.

Is there an elegant/easier solution to do this that I'm missing?

1 Answer

In case anyone runs into this problem in the future, here is the solution I came up with:

db.templates.aggregate([
  // Start from the root template.
  {
    "$match": { "uuid": "1" }
  },
  // 1. Fetch each distinct referenced template once.
  {
    "$lookup": {
      "from": "templates",
      "let": { "uuids": "$related_templates" },
      "pipeline": [
        {
          "$match": {
            "$expr": { "$in": [ "$uuid", "$$uuids" ] }
          }
        }
      ],
      "as": "related_templates_objects"
    }
  },
  // 2. Build a parallel array holding only the uuids of the fetched templates.
  {
    "$addFields": {
      "related_templates_objects_uuids": {
        "$map": {
          "input": "$related_templates_objects",
          "in": "$$this.uuid"
        }
      }
    }
  },
  // 3. Replace each reference with the fetched template at the matching index.
  {
    "$addFields": {
      "related_templates": {
        "$map": {
          "input": "$related_templates",
          "in": {
            "$arrayElemAt": [
              "$related_templates_objects",
              { "$indexOfArray": [ "$related_templates_objects_uuids", "$$this" ] }
            ]
          }
        }
      }
    }
  },
  // 4. Drop the two intermediate fields.
  {
    "$project": { "related_templates_objects": 0, "related_templates_objects_uuids": 0 }
  }
])

In summary:

  1. Do the lookup without duplicates, storing the result in related_templates_objects.
  2. Create a parallel array, related_templates_objects_uuids, that holds only the uuid of each object in related_templates_objects, in the same order.
  3. Create the desired array of objects by mapping each of the original references in related_templates to the matching object in related_templates_objects (its index is found by searching related_templates_objects_uuids).
  4. Project out the two intermediate fields that were used to build the new related_templates.
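
To make steps 2 and 3 concrete, here is a rough in-memory sketch in plain JavaScript of what the two $addFields stages compute for t1. The variable names are mine and only mirror the field names in the pipeline; the data is the sample from the question.

```javascript
// Original references on t1, including the duplicate.
const relatedTemplates = ["2", "2"];

// Step 1 result: the $lookup returns one document per distinct uuid.
const relatedTemplatesObjects = [
  { _id: 2, uuid: "2", name: "t2", related_templates: ["3", "3"] },
];

// Step 2: parallel array of uuids (the first $map).
const relatedTemplatesObjectsUuids = relatedTemplatesObjects.map((doc) => doc.uuid);

// Step 3: for each original reference, find its index in the uuid array
// (like $indexOfArray) and take the document at that index (like $arrayElemAt).
const expanded = relatedTemplates.map(
  (uuid) => relatedTemplatesObjects[relatedTemplatesObjectsUuids.indexOf(uuid)]
);

// Both entries now point at t2, so the duplicate reference is preserved.
console.log(expanded.map((doc) => doc.name));
```

One difference to keep in mind: in the aggregation version, $indexOfArray returns -1 for a uuid that was not found, and $arrayElemAt treats -1 as "last element", so a dangling reference would silently resolve to the last fetched object rather than to nothing. That does not matter for the sample data, where every reference exists.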

Of course, this solution is not recursive. It is possible to recurse one level by copying the last four stages of the outer pipeline into the $lookup's inner pipeline, after its $match stage. Repeating the same copy-and-paste formula recurses as many more levels as needed, which is what I coded into my project.
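
The copy-and-paste formula can be generated rather than written by hand. The following is a minimal sketch, not the exact code from my project: buildExpandStages and its depth parameter are hypothetical names. It also adds an $ifNull guard that the pipeline above does not have, on the assumption that leaf documents such as t4 have no related_templates field and would otherwise make $in fail inside the nested lookups. A side effect of the guard is that leaves reached this way come back with an empty related_templates array.

```javascript
// Returns the four "expand" stages, nested `depth` levels deep.
// depth = 1 corresponds to the non-recursive pipeline above.
function buildExpandStages(depth) {
  // Inner pipeline: match the referenced templates, then expand their own
  // references if more levels were requested.
  const inner = [{ $match: { $expr: { $in: ["$uuid", "$$uuids"] } } }];
  if (depth > 1) {
    inner.push(...buildExpandStages(depth - 1));
  }

  // Assumption: treat a missing related_templates field as an empty array.
  const refs = { $ifNull: ["$related_templates", []] };

  return [
    {
      $lookup: {
        from: "templates",
        let: { uuids: refs },
        pipeline: inner,
        as: "related_templates_objects",
      },
    },
    {
      $addFields: {
        related_templates_objects_uuids: {
          $map: { input: "$related_templates_objects", in: "$$this.uuid" },
        },
      },
    },
    {
      $addFields: {
        related_templates: {
          $map: {
            input: refs,
            in: {
              $arrayElemAt: [
                "$related_templates_objects",
                { $indexOfArray: ["$related_templates_objects_uuids", "$$this"] },
              ],
            },
          },
        },
      },
    },
    { $project: { related_templates_objects: 0, related_templates_objects_uuids: 0 } },
  ];
}

// Three levels cover t1 -> t2 -> t3 -> t4 in the sample data.
const pipeline = [{ $match: { uuid: "1" } }, ...buildExpandStages(3)];
// In the mongo shell: db.templates.aggregate(pipeline)
```

The depth still has to be chosen up front, so this only works when the maximum nesting of the tree is known or can be bounded.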

Hope the solution helps someone.