0

I need to update values in an array in many MongoDB documents using a bulk write operation (the new array differs for each document). My question is: is there a way to update a subset of array values based on the element position range, passing an array to replace those values with?

Here is my document structure:

{
year: 2020,
location_id: 1,
values: [1.2 0 0 5.2 1.02 8 0 0 0 0 1.2 4]
}

Let's say I want to replace all values except the first one with this new array:

[1 2 5.1 2 4 0 87 1 0.2 2 9]

i.e., 'values' should become:

[1.2 1 2 5.1 2 4 0 87 1 0.2 2 9]

I know how to replace the array fully:

UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values': [1.2 1 2 5.1 2 4 0 87 1 0.2 2 9]}})

But this first option requires me to query the collection first to get the first value of the array. I'd like to avoid that.

I also know how to replace values one by one with 11 separate updateone commands (here in python):

col.bulk_write([UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.1': 1}}),
                UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.2': 2}}),
                .... ,
                UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.11': 9}})],
               ordered = False)

This second option means I'm going to send millions of UpdateOne statements daily because I have several million documents that get updated every day and the array to be updated is much larger than in this simple example. I don't like that very much either.

Note that a related question is: Could this second option overload my server or is it not taking more resources than the first option anyway? The first option would mean a bulkwrite with, say 300'000 updateOne statements, each modifying a 365 long array, while the 2nd option would mean 300'000*364 updateOne statements, each modifying a single array element.

I would like to be able to do something like:

UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.1-end': [1 2 5.01 2 4 0 87 1 0.2 2 9]}})

i.e., specify a range of array positions to be replaced by a provided array. So far, I couldn't find how to do that. This would lead to a bulkwrite with 300'000 updateOne statements, each replace exactly those 364 values that need to be replaced.

NOTE: In this example it's replacing "all except the first value", but it may also be "all except the last value". Possibly also "replace elements at positions 4 to 9 by this array of length 6". I'm writing my code in python.

bguillod
  • 3
  • 1

1 Answers1

0

You can use an aggregation pipeline and use this for update:

db.collection.updateOne(
   {'year': 2020, 'location_id': 1},
   [{
      $set: {
         values: {
            $concatArrays: [
               [{ $arrayElemAt: ["$values", 0] }],
               [1, 2, 5.1, 2, 4, 0, 87, 1, 0.2, 2, 9]
            ]
         }
      }
   }]
)

For operations like "replace elements at positions 4 to 9 by this array of length 6" you may also use $slice

Also operator $range may help, see How to modify value in array by position at aggregation framework

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110
  • Thanks, that's exactly what I was looking for! – bguillod Feb 27 '20 at 10:26
  • Could it be this only works with mongodb version 4.2 onwards? I have 3.6 and it fails: BSON field 'update.updates.u' is the wrong type 'array', expected type 'object'. – bguillod Mar 02 '20 at 13:20
  • Yes but updateOne does not accept an aggregation pipeline before 4.2, right? So I can't use $concatArrays in my updateone if I'm before 4.2? – bguillod Mar 02 '20 at 13:37
  • Yes, using an aggregation pipeline in `updateOne` was added in 4.2, see [Update with an Aggregation Pipeline](https://docs.mongodb.com/manual/reference/method/db.collection.updateOne/index.html#updateone-behavior-aggregation-pipeline). Most likely you have to update the collection with a loop. `$set` was also introduced in 4.2, use `$addFields` instead. – Wernfried Domscheit Mar 02 '20 at 13:38
  • My workaround, in the end, looks as follows: `db.collection.updateOne( {'year': 2020, 'location_id': 1}, { $set: { values.1: 1, values.2: 2, (and so on until:) values.11: 0 } } } ) ` It's a pity no better solution exists before version 4.2 but at least there a way... – bguillod Mar 02 '20 at 15:01