
I'm using a Node-based CMS called Keystone, which uses MongoDB as its data store and gives fairly liberal control over data and access. I have a very complex model called Family, which has about 250 fields, a bunch of relationships, and a dozen or so methods. A form on my site lets the user enter the information required to create a new Family record, but the processing time is running long (12s on localhost and over 30s on my Heroku instance). The issue I'm running into is that Heroku emits an application error for any request that runs over 30s, which means I need to optimize my query. All processing happens very quickly except for one function. Below is the offending function:

const Family = keystone.list( 'Family' );

exports.getNextRegistrationNumber = ( req, res, done ) => {

    console.time('get registration number');

    const locals = res.locals;

    Family.model.find()
        .select( 'registrationNumber' )
        .exec()
        .then( families => {

            // get an array of registration numbers
            const registrationNumbers = families.map( family => family.get( 'registrationNumber' ) );

            // get the largest registration number
            locals.newRegistrationNumber = Math.max( ...registrationNumbers ) + 1;

            console.timeEnd('get registration number');

            done();

        }, err => {

            console.timeEnd('get registration number');

            console.log( 'error setting registration number' );

            console.log( err );

            done();
        });

};

The processing in my .then() happens in milliseconds; it's the Family.model.find() that takes far too long to execute. Any advice on how to speed things up would be greatly appreciated. The query is digging through about 40,000 Family records, and there is already an index on the registrationNumber field.

autoboxer

1 Answer


It makes sense that the .then() executes quickly while the .find() takes a while: computing the largest value in an array that is already in memory is a cheap operation, whereas fetching (and hydrating) tens of thousands of records can be very time-consuming depending on a number of factors.

If you are simply reading the data and presenting it to the user via REST or some sort of visual interface, you can use lean(), which returns plain JavaScript objects. By default, Mongoose hydrates each result into a full mongoose.Document, which in your case is unnecessary: there is no data manipulation after your read query; you are just getting the data.
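As a minimal sketch of that change (assuming the same Family model and locals as in the question), note that lean results are plain objects, so family.get() is replaced by direct property access:

```javascript
// Sketch only: the question's query with .lean() added.
Family.model.find()
    .select( 'registrationNumber' )
    .lean() // skip hydration into mongoose.Document instances
    .exec()
    .then( families => {

        // families are now plain { _id, registrationNumber } objects,
        // so read the field directly instead of calling .get()
        const registrationNumbers = families.map( family => family.registrationNumber );

        locals.newRegistrationNumber = Math.max( ...registrationNumbers ) + 1;

        done();
    });
```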

More importantly, it appears that all you need is one record: the one with the largest registrationNumber. Whenever you are looking for a single record, use findOne() so the database can stop after the first match; combined with a descending sort on the indexed registrationNumber field, it returns exactly the record you need.
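A sketch of that approach, again assuming the question's Family model and locals; the descending sort on the indexed field means MongoDB can walk the index and return a single document:

```javascript
// Sketch: fetch only the document with the largest registrationNumber.
Family.model.findOne()
    .sort( '-registrationNumber' ) // largest registrationNumber first
    .select( 'registrationNumber' )
    .lean()
    .exec()
    .then( family => {

        // if the collection is empty, start numbering at 1
        locals.newRegistrationNumber = ( family ? family.registrationNumber : 0 ) + 1;

        done();
    });
```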

See this previous answer detailing the use of findOne() in a Node.js implementation, or see the MongoDB documentation for general information about this collection method.

lax1089
  • Yes, finding a max stored value is something that databases are good at. – Sergio Tulentsev Jun 10 '17 at 15:24
  • Thanks for the help @lax1089. It looks like it was a number of factors, lean() being one of them. I didn't realize how efficient Mongo was at fetching data; the issue was everything that happened after the query that was causing the slowdown. The lesson I learned was that even if the query itself is complex, the size of the objects that are returned is what's important. I added .lean(), but also utilized .where() to pare down the result set in the query instead of in logic after I received results, as well as .select() to select only the fields I needed. – autoboxer Jun 12 '17 at 18:54
  • 1
    Although it was Node-specific, it's worth noting that an easy way to do performance testing and tuning is by adding console.time( 'thing to time' ) and console.timeEnd( 'thing to time' ) around whatever you want to test. It'll print the number of milliseconds it took to perform that operation. I'm in the habit of putting them at the start and finish of every asynchronous function I create in new code to see if there are any long-running blocks. If I find any, I'll narrow the scope of my test to determine exactly where the slowdown is happening. – autoboxer Jun 12 '17 at 18:56