0

I have a situation almost, but not quite, entirely unlike this one. I have a collection to which I must assign new ids based on the old ones.

I thought of doing it using aggregation, adding a new field to preserve the old id in a new field aptly called "oldId", then setting the _id to my new value (calculated by a function).

Then I'd like to merge that back into the collection (not all documents are adjusted in this matter — don't ask) but aye, there's the rub: I need to merge on _id on one side, and oldId on the other.

I need to merge the documents coming out of the pipeline into the existing collection on $$new.oldId === $_id.

How do I do that?

I thought of using an aggregation pipeline in the whenMatched of the merge, but it doesn't allow modification of the fields the merge was made on, which is exactly what I need to do.

I also thought of doing it using a regular updateMany, but unfortunately

Uncaught MongoServerError: After applying the update, the (immutable) field '_id' was found to have been altered to _id: [redacted]


The solution I ended up using, which worked in my case, was dropping the $match and modifying the transformation, to leave the id alone in some cases, then using $out to replace the entire collection.

Full example of my current solution:

db.foo.insertMany([
  {_id: 123, description: "This id needs to be changed to 369" },
  {_id: 234, description: "This id needs to be left alone" },
  {_id: 345, description: "This id needs to be left alone" }
]);

print("===========");

db.foo.find({});

print("===========");

db.foo.aggregate([
  {$addFields: {
    oldId: "$_id",
    _id: { $function: {
      body: function(id) {
        if (id === 123) {
          return id * 3;
        } else {
          return id;
        }
      },
      args: ["$_id"],
      lang: "js"
    }}
  }},
  {$out: "foo"}
]);

db.foo.find({});

But I'd still like to know if I could've done better. I don't like modifying documents when I don't need to and I'd like to use the more fine-grained control that $merge would've provided.

That would look something like this:

db.foo.aggregate([
  {$match:{
    _id: 123
  }},
  {$addFields: {
    oldId: "$_id",
    _id: { $function: {
      body: function(id) {
        return id * 3;
      },
      args: ["$_id"],
      lang: "js"
    }}
  }},
  {$merge: {
    into: "foo",
    on: "$$new.oldId === $_id", // this is what I'm trying to figure out
    whenMatched: "replace",
    whenNotMatched: "discard"
  }}
]);
SQB
  • 3,926
  • 2
  • 28
  • 49
  • Not clear what you mean, there is no `new` field. Apart from that, you cannot modify the `_id` of an existing document in MongoBD. Note, when you set `{ $addFields: { _id: ...}}` then you don't modify any existing data, however it turns into a "real" update when you use the `$merge` stage. – Wernfried Domscheit Jun 20 '23 at 10:44
  • @WernfriedDomscheit Yes, there is a `$$new` that references the outcome of the pipeline so far. So I need to match on the `_id` of the current collection being equal to `$$new.oldId` coming out of the pipeline. – SQB Jun 20 '23 at 10:47
  • In this case please show us. But as said, [you cannot modify](https://www.mongodb.com/docs/manual/core/document/#field-names) the `_id` field anyway. – Wernfried Domscheit Jun 20 '23 at 11:01
  • @WernfriedDomscheit but I'm not modifying the `_id` field, I'm replacing a document that happens to have a different id. – SQB Jun 20 '23 at 11:20
  • Please provide more details, i.e. all relevant parts of the aggregation pipeline (currently it does not say anything) sample input documents and desired results. – Wernfried Domscheit Jun 20 '23 at 11:36
  • @WernfriedDomscheit like so? – SQB Jun 20 '23 at 15:02

1 Answers1

0

In principle it could be this one:

db.collection.aggregate([
  { $match: { _id: 123 } },
  {
    $merge: {
      into: "collection",
      on: "_id",
      whenMatched: [
        {
          $set: {
            _id: { $function: {
              body: function(id) {
                return id * 3;
              },
              args: ["$_id"],
              lang: "js"
            }}
          }
        }
      ],
      whenNotMatched: "discard"
    }
  }
])

or this:

db.collection.aggregate([
  { $match: { _id: 123 } },
  {
    $addFields: {
      newId: { $function: {
        body: function(id) {
          return id * 3;
        },
        args: ["$_id"],
        lang: "js"
      }}
    }
  },
  {
    $merge: {
      into: "collection",
      on: "_id",
      whenMatched: [
        {
          $set: {
            _id: "$$new.newId",
            newId: "$$REMOVE"
          }
        }
      ],
      whenNotMatched: "discard"
    }
  }
])

However, as I told you more than once: you cannot modify the _id field - period. And the pipeline cannot modify the on field's value. So, there are even two constraints blocking your idea.

You need to insert a new document with new _id and delete the old one. I don't think you can do it with a single command.

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110