MapReduce, MongoDB and node-mongodb-native

Question

I'm using the node-mongodb-native library to run a MapReduce on MongoDB (from node.js).

Here's my code:

var map = function() {
        emit(this._id, {'count': this.count});
    };
var reduce = function(key, values) {
        return {'testing':1};
    };
collection.mapReduce(
    map,
    reduce,
    {
        query:{ '_id': /s.*/g },
        sort: {'count': -1},
        limit: 10,
        jsMode: true,
        verbose: false,
        out: { inline: 1 }
    },
    function(err, results) {
        logger.log(results);
    }
);

Two questions:

1) Basically, my reduce function is ignored. No matter what I put in it, the output remains just the result of my map function (no 'testing', in this case). Any ideas?

2) I get an error unless an index is defined on the field used for the sort (in this case - the count field). I understand this is to be expected. It seems inefficient as surely the right index would be (_id, count) and not (count), as in theory the _id should be used first (for the query), and only then the sorting should be applied to the applicable results. Am I missing something here? Is MongoDB inefficient? Is this a bug?

Thanks! :)

How are you executing that map/reduce? Could you paste the command? — Alberto Zaccagni, Jul 14 '13 at 23:03
The above is my complete code in node.js, using node-mongodb-native (which is the official client code supported by 10gen). node-mongodb-native runs the command for you when you call mapReduce for a collection. The code in the library starts with "Collection.prototype.mapReduce =" at: https://github.com/mongodb/node-mongodb-native/blob/master/lib/mongodb/collection.js — Assaf Hershko, Jul 14 '13 at 23:09
I do get the results in the callback, it's just that it looks like the reduce function is ignored... — Assaf Hershko, Jul 14 '13 at 23:15
So you don't call it like `db.inline.find()`, you just invoke `mapReduce` on the collection? — Alberto Zaccagni, Jul 14 '13 at 23:15
Yep. The library creates and executes a DB command based on it. The library code is at the link in my previous comment. — Assaf Hershko, Jul 14 '13 at 23:20

score 4 · Answer 1 · answered Aug 13 '13 at 11:17

Your reduce function is never called because you emit exactly one value for each key, so there is nothing for MongoDB to reduce. Here is an example that does trigger the reduce function:

collection.insert([{group: 1, price: 41}, {group: 1, price: 22}, {group: 2, price: 12}], {w: 1}, function(err, r) {
    if (err) return console.dir(err);

    // Emit by "group", so documents sharing a group end up under one key
    var map = function() {
        emit(this.group, this.price);
    };

    // Only called for keys that have more than one emitted value
    var reduce = function(key, values) {
        return Array.sum(values);
    };

    collection.mapReduce(
        map,
        reduce,
        {
            query: {},
            // sort: {'count': -1},
            // limit: 10,
            // jsMode: true,
            // verbose: false,
            out: { inline: 1 }
        },
        function(err, results) {
            console.dir(err);
            console.dir(results);
        }
    );
});

Notice that we emit by the "group" key, so there are n >= 1 values for each group, and reduce runs for any group that has more than one value. Since you emit by _id, each key is unique and has exactly one value, so the reduce function is not needed.

http://docs.mongodb.org/manual/reference/command/mapReduce/#requirements-for-the-reduce-function
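To make the behaviour above concrete, here is a rough in-memory sketch of the grouping step in plain node.js. It is only a model, not how the server is implemented: `simulateMapReduce` is a hypothetical helper, `emit` is passed to map as an argument instead of being a global, the `_id` values are made up, and `Array.sum` (a mongo shell helper) is replaced with `values.reduce`. It also ignores that the real server may call reduce more than once for the same key.

```javascript
// Hypothetical in-memory model of mapReduce; not part of node-mongodb-native.
function simulateMapReduce(docs, map, reduce) {
  const valuesByKey = new Map();
  // Stand-in for the server-side global emit(): collect values per key.
  const emit = (key, value) => {
    if (!valuesByKey.has(key)) valuesByKey.set(key, []);
    valuesByKey.get(key).push(value);
  };
  // Run map with `this` bound to each document, as MongoDB does.
  for (const doc of docs) map.call(doc, emit);

  let reduceCalls = 0;
  const results = [];
  for (const [key, values] of valuesByKey) {
    if (values.length === 1) {
      // A key with a single value is passed through; reduce is skipped.
      results.push({ _id: key, value: values[0] });
    } else {
      reduceCalls += 1;
      results.push({ _id: key, value: reduce(key, values) });
    }
  }
  return { results, reduceCalls };
}

// Emitting by _id (made-up sample ids): every key is unique, reduce never runs.
const unique = simulateMapReduce(
  [{ _id: 'sa', count: 3 }, { _id: 'sb', count: 5 }],
  function (emit) { emit(this._id, { count: this.count }); },
  function (key, values) { return { testing: 1 }; }
);
console.log(unique.reduceCalls, unique.results);

// Emitting by group: group 1 has two values, so reduce runs once.
const grouped = simulateMapReduce(
  [{ group: 1, price: 41 }, { group: 1, price: 22 }, { group: 2, price: 12 }],
  function (emit) { emit(this.group, this.price); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);
console.log(grouped.reduceCalls, grouped.results);
```

In the first call the output is just the emitted `{count: ...}` values, which matches what the question describes. In the second, only group 1 goes through reduce, and group 2 is returned as emitted.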
