
As the code is too large to post in here, I am appending my GitHub repo: https://github.com/DiegoGallegos4/Mongo

I am trying to use the NodeJS driver to update some records fulfilling one criterion, but first I have to find some records fulfilling another criterion. In the update part, the records found and filtered by the find operation are used. That is,

file: weather1.js

MongoClient.connect(/* some url */, function(err, db) {
    db.collection(collection_name)
        .find({}, {}, /* sort criteria */)
        .toArray(function(err, data) {
            // ... filter the documents and append the matches to an array,
            // ... then, inside a for loop over that array:
            db.collection(collection_name).update(data[i], { $set: /* ... */ }, callback);
        });
});

That's the structure used to solve the problem. Regarding when to close the connection: it is closed when the number of completed update callbacks equals the length of the data array. For more details you can refer to the repo.
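In rough terms, that counting idea amounts to something like the following. This is only a minimal sketch, not the repo's actual code; the connection URL, collection name, and update are borrowed from the answer below for illustration.

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/data', function(err, db) {
    if (err) throw err;

    var coll = db.collection('weather');

    coll.find({}).toArray(function(err, data) {
        if (err) throw err;

        var completed = 0;                      // number of update callbacks that have fired

        data.forEach(function(doc) {
            coll.update(doc, { $set: { month_high: true } }, function(err) {
                if (err) throw err;
                completed++;
                if (completed === data.length)  // close only once every update has called back
                    db.close();
            });
        });
    });
});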

file: weather.js

In the other approach, .each is used instead of toArray to iterate over the cursor.
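For reference, a sketch of what that second approach looks like with the 2.x driver's (now deprecated) cursor.each. This is illustrative only, using the same assumed URL and collection name as above, and it shows the structure of the problem rather than a fix:

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/data', function(err, db) {
    if (err) throw err;
    var coll = db.collection('weather');

    coll.find().sort([['State', 1], ['Temperature', -1]]).each(function(err, doc) {
        if (err) throw err;
        if (doc == null) {              // each() signals the end of the cursor with null,
            db.close();                 // but pending update callbacks may still be in flight
            return;
        }
        // ... same month-high filtering as before, then conditionally:
        coll.update(doc, { $set: { month_high: true } }, function(err) {
            if (err) throw err;
        });
    });
});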

I've been looking for a solution to this for a week now on several forums.

I've read about connection pooling, but I want to know what the conceptual error in my code is. I would appreciate deep insight on this topic.


1 Answer


The way you pose your question is very misleading. All you want to know is "When is the processing complete so I can close?".

The answer to that is that you need to respect the callbacks and, generally, only move through the cursor of results once each update is complete.

The simple way, without other dependencies, is to use the stream interface supported by the driver:

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/data',function(err,db){

    if(err) throw err;

    var coll = db.collection('weather');
    console.log('connection established');

    var stream = coll.find().sort([['State',1],['Temperature',-1]]);

    stream.on('error',function(err) {
        throw err;
    });

    stream.on('end',function() {
        db.close();
    });

    var month_highs = [];
    var state = '';
    var length = 0;


    stream.on('data',function(doc) { 
        stream.pause();                           // pause processing documents

        if (doc) {
            length = month_highs.length;
            if(state != doc['State']){
                month_highs.push(doc['State']);
                //console.log(doc);
            }
            state = doc['State'];

            if(month_highs.length > length){
                coll.update(doc, {$set : {'month_high':true} }, function(err, updated){
                    if (err) throw err;
                    console.log(updated);
                    stream.resume();              // resume processing documents
                });
            } else {
                stream.resume();
            }
        } else {
            stream.resume();
        }
    });

});

That's just a copy of the code from your repo, refactored to use a stream. So all the important parts are where the word "stream" appears and, most importantly, where its methods are being called.

In a nutshell, the "data" event is emitted for each document in the cursor results. First you call .pause() so new documents do not overrun the processing. Then you do your .update() and, within its callback on return, you call .resume(), and the flow continues with the next document.

Eventually "end" is emitted when the cursor is depleted, and that is where you call db.close().

That is basic flow control. For other approaches, look at the node async library as a good helper. But do not loop over arrays with no async control, and do not use .each(), which is DEPRECATED.
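For example, a hedged sketch of the async route using async.eachSeries (the connection details are assumed, and the conditional month-high logic is omitted for brevity) would let the library drive the one-at-a-time flow instead of manual pause/resume:

var async = require('async');
var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/data', function(err, db) {
    if (err) throw err;
    var coll = db.collection('weather');

    coll.find().sort([['State', 1], ['Temperature', -1]]).toArray(function(err, docs) {
        if (err) throw err;

        // eachSeries waits for each update's callback before moving to the next document
        async.eachSeries(docs, function(doc, callback) {
            coll.update(doc, { $set: { month_high: true } }, callback);
        }, function(err) {
            if (err) throw err;
            db.close();                 // every update has completed
        });
    });
});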

At any rate, you need to signal when the .update() callback is complete before moving on to a new "loop iteration". This is the basic, no-additional-dependency approach.

P.S. I am a bit suspicious of the general logic of your code, especially testing whether the length of something is greater than it was when you read it, without necessarily changing that length in between. But this is all about how to implement "flow control", and not about fixing the logic in your code.

Blakes Seven
  • How could I improve the testing logic? Every time a new item is added to the array, I want to update that item; that's how I stated it. Look at this error thrown by the code above: https://github.com/DiegoGallegos4/Mongo/blob/master/error.rtf. My version of mongo is 3.0.5, if that helps. – Diego Gallegos Sep 03 '15 at 08:00
  • @DiegoGallegos I am really not sure what you are trying to do here, so perhaps you should fully ask that in another question. Your question here was basically about the "flow control" and signalling when each update and all updates were complete ( though you could have been a lot more to the point about that ). If you have other questions then [Ask A new Question](http://stackoverflow.com/questions/ask). We can only really answer the question asked, and it's clearer to keep your "questions" separate. – Blakes Seven Sep 03 '15 at 08:08