I have a large MongoDB collection (3kk, i.e. about 3 million documents) in which I need to convert one field from a numeric string to a number.

I'm using a mongo shell script that works for a small collection of 100k documents:

db.SurName.find().forEach(function(tmp){
    tmp.NUMBER = parseInt(tmp.NUMBER);
    db.SurName.save(tmp);
})

But after about a dozen minutes of running I get an error (it occurs even on a smaller collection of about 1 million documents):

MongoDB Enterprise Test-shard-0:PRIMARY> db.SurName.find().forEach(function(tmp){
...         tmp.NUMBER = parseInt(tmp.NUMBER);
... db.SurName.save(tmp);
...     })
2020-01-18T16:59:21.173+0100 E  QUERY    [js] Error: command failed: {
        "operationTime" : Timestamp(1579363161, 14),
        "ok" : 0,
        "errmsg" : "cursor id 4811116025485863761 not found",
        "code" : 43,
        "codeName" : "CursorNotFound",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1579363161, 14),
                "signature" : {
                        "hash" : BinData(0,"EemWWenbArSdh4dTFa0aNcfAPms="),
                        "keyId" : NumberLong("6748451824648323073")
                }
        }
} : getMore command failed: {
        "operationTime" : Timestamp(1579363161, 14),
        "ok" : 0,
        "errmsg" : "cursor id 4811116025485863761 not found",
        "code" : 43,
        "codeName" : "CursorNotFound",
        "$clusterTime" : {
                "clusterTime" : Timestamp(1579363161, 14),
                "signature" : {
                        "hash" : BinData(0,"EemWWenbArSdh4dTFa0aNcfAPms="),
                        "keyId" : NumberLong("6748451824648323073")
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:583:17
assert.commandWorked@src/mongo/shell/assert.js:673:16
DBCommandCursor.prototype._runGetMoreCommand@src/mongo/shell/query.js:802:5
DBCommandCursor.prototype._hasNextUsingCommands@src/mongo/shell/query.js:832:9
DBCommandCursor.prototype.hasNext@src/mongo/shell/query.js:840:16
DBQuery.prototype.hasNext@src/mongo/shell/query.js:288:13
DBQuery.prototype.forEach@src/mongo/shell/query.js:493:12
@(shell):1:1

Is there a way to do this better/right?

EDIT: The obj schema:

{"_id":{"$oid":"5e241b98c7cab1382c7c9d95"},
"SURNAME":"KOWALSKA",
"SEX":"KOBIETA",
"TERYT":"0201011",
"NUMBER":"51",
"COMMUNES":"BOLESŁAWIEC",
"COUNTIES":"BOLESŁAWIECKI",
"PROVINCES":"DOLNOŚLĄSKIE"
}
Szelek
  • Does this answer your question? [MongoDB CursorNotFound Error on collection.find() for a few hundred small records](https://stackoverflow.com/questions/51526688/mongodb-cursornotfound-error-on-collection-find-for-a-few-hundred-small-record) – ArielGro Jan 19 '20 at 13:16

2 Answers

** EDIT - START **

Googling "cursor id not found code 43" yielded this answer: https://stackoverflow.com/a/51602507/2279082

** EDIT - END **

I don't have your data set, so I cannot test my answer properly. That said, you can try updating only the specific field instead of saving the whole document (see db.collection.update in the docs).

So your script will look like this:

db.SurName.find({}, {NUMBER: 1}).forEach(function(tmp){
    db.SurName.update({_id: tmp._id}, {$set: {NUMBER: parseInt(tmp.NUMBER)}});
})
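
A batched variant of the same update might look like the sketch below. It assumes the error comes from the cursor sitting idle on the server between batches for longer than the cursor timeout (10 minutes by default), so it never keeps one cursor open for the whole run: it walks the collection in `_id` order with one short query per chunk and sends the updates with `bulkWrite`. The names `toUpdateOps` and `convertInChunks` and the chunk size of 1000 are illustrative, not from the original script.

```javascript
// Turns a chunk of documents into bulkWrite operations that set NUMBER
// to its integer value.
function toUpdateOps(docs) {
  return docs.map(function (doc) {
    return {
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { NUMBER: parseInt(doc.NUMBER, 10) } }
      }
    };
  });
}

// Walks the collection in _id order, issuing one short-lived query per chunk.
// Returns the number of documents processed.
function convertInChunks(coll, chunkSize) {
  var lastId = null;
  var total = 0;
  while (true) {
    // Resume after the last _id seen; the first pass matches everything.
    var query = lastId === null ? {} : { _id: { $gt: lastId } };
    var docs = coll.find(query, { NUMBER: 1 })
                   .sort({ _id: 1 })
                   .limit(chunkSize)
                   .toArray();
    if (docs.length === 0) break;
    coll.bulkWrite(toUpdateOps(docs), { ordered: false });
    lastId = docs[docs.length - 1]._id;
    total += docs.length;
  }
  return total;
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  print(convertInChunks(db.SurName, 1000));
}
```

Paging on `_id` uses the default `_id` index, so each chunk starts where the previous one ended instead of rescanning the collection.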

Let me know if it helps or if it needs an edit.

ArielGro
  • It's giving me the same error. – Szelek Jan 19 '20 at 12:36
  • Assuming it is a memory issue, and all you need is the `_id` and the `NUMBER` fields, you can run the `find` part as follows: `db.SurName.find({}, {NUMBER: 1})` (I edited my answer to match). The first `{}` is to match all collection items. The second one will return just the `_id` and `NUMBER` fields – ArielGro Jan 19 '20 at 13:00
  • Googling "cursor id not found code 43", yielded this answer: https://stackoverflow.com/a/51602507/2279082 – ArielGro Jan 19 '20 at 13:13
1

The best and fastest solution is to use a MongoDB aggregation with the $out stage.

It is the equivalent of this SQL:

insert into new_table
select * from old_table

We convert the NUMBER field with the $toInt operator (MongoDB version >= 4.0) and store the documents in the SurName2 collection. Once that has finished, we drop the old collection and rename SurName2 to SurName.

db.SurName.aggregate([
  {$addFields:{
    NUMBER : {$toInt:"$NUMBER"}
  }},
  {$out: "SurName2"}
])

Once you have checked that everything is fine, run these statements:

db.SurName.drop()
db.SurName2.renameCollection("SurName")
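
If there is no room for a second collection, an in-place variant could be sketched as follows. It assumes MongoDB 4.2 or later, where `updateMany` accepts an aggregation pipeline as the update, so the conversion runs on the server without a client-side cursor. `buildConversion` is a hypothetical helper introduced only for this illustration.

```javascript
// Hypothetical helper: builds the filter and the update pipeline for
// converting a string field to an integer in place.
function buildConversion(field) {
  var filter = {};
  filter[field] = { $type: "string" };        // skip documents already converted
  var setStage = {};
  setStage[field] = { $toInt: "$" + field };  // same operator as in the $out version
  return { filter: filter, pipeline: [{ $set: setStage }] };
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var conv = buildConversion("NUMBER");
  printjson(db.SurName.updateMany(conv.filter, conv.pipeline));
}
```

Because the filter matches only string values, the command can be re-run safely if it is interrupted. Note that $toInt raises an error on a string that cannot be parsed as an integer.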
Valijon
  • Your solution works. Is there a way for doing this on the same collection (without copying)? The thing is, I don't have room for mirroring the collection. Will it be: db.SurName.aggregate([ {$addFields:{ NUMBER : {$toInt:"$NUMBER"} }} ]) ? – Szelek Jan 19 '20 at 12:43
  • aggregation was not made for this purpose.. – ArielGro Jan 19 '20 at 13:01
  • @szelek you may export Surname collection, modify NUMBER to integer: search `NUMBER" :"(.*?)"` replace NUMBER" : $1 . Then import again with --drop option – Valijon Jan 19 '20 at 15:22
  • @Valijon - feels like an overkill to me.... – ArielGro Jan 19 '20 at 15:25
  • Not able to import or export such large json ;( – Szelek Jan 19 '20 at 17:12
  • `mongoexport` / `mongoimport`?? – Valijon Jan 19 '20 at 17:20
  • @Szelek try: `path/to/mongoexport -d database -c SurName --out path/to/SurName.json` – Valijon Jan 19 '20 at 17:27
  • In the end I did it with a partial import using the mongoimport tool: mongoimport --host Test-shard-0/test-shard-00-00-gceee.mongodb.net:27017,test-shard-00-01-gceee.mongodb.net:27017,test-shard-00-02-gceee.mongodb.net:27017 --ssl --username --password --authenticationDatabase admin --db EngineeringProject --collection SurName --type json --file "C:\Users\Ja\Desktop\ConverterCSVtoJSON\STAT_NAZWISKA_KOBIETY3.json" --jsonArray – Szelek Jan 19 '20 at 17:54
  • @Szelek Excellent. Good luck – Valijon Jan 19 '20 at 18:05