
I have one collection in which student_id is the primary key:

test1:{student_id:"xxxxx"},

I have another collection in which student_id sits inside an array in each document:

class:{"class":"I", students:[{"student_id":"xxxx"}]}

My problem is that I want to join these two collections on student_id.

I am using map-reduce with "merge" as the out option, but it does not work.

My map-reduce commands are as follows:

db.runCommand({
    mapreduce: "test1",
    map: function Map() {
        emit(this._id, this);
    },
    reduce: function Reduce(key, values) {
        return values;
    },
    out: { merge: "testmerge" }
});

db.runCommand({
    mapreduce: "class",
    map: function Map() {
        emit(this._id, this);
    },
    reduce: function Reduce(key, values) {
        return values;
    },
    out: { merge: "testmerge" }
});

But it inserts two rows.

Can someone guide me on this? I am very new to map-reduce.

As in the example, I want to get the details of all students from the "test1" collection who are studying in class "I".

Phalguni Mukherjee
  • Why do you want to "join" those two tables? Usually if you need to "join" it means you need to redesign your schema. You would be better off showing a full document out of each collection, and explaining what you'd like as result. – Derick Aug 07 '13 at 10:28
  • @Derick I am preparing a thin client for MongoDB that needs to collect historical data. The data is distributed across multiple collections, and I have to collect all of it in one go for analysis, without doing multiple reads. – Phalguni Mukherjee Aug 07 '13 at 10:31
  • Add it to the question, not in comments please... – Derick Aug 07 '13 at 10:33
  • I edited the question. As in the example, I want to get the details of all students from the "test1" collection who are studying in class "I". – Phalguni Mukherjee Aug 07 '13 at 10:34
  • You should consider a schema redesign rather than trying to force MongoDB to do something it wasn't designed to do. MapReduce output in MongoDB, unlike in some other NoSQL databases, isn't automatically updated. – WiredPrairie Aug 07 '13 at 10:59

1 Answer


Your requirement seems to be:

As in the example, I want to get the details of all students from the "test1" collection who are studying in class "I".

In order to do that, store which classes a student is in with the student:

{
    student_id: "xxxxx",
    classes: ["I"]
}

Then you can ask for the information on all students in a class with:

db.students.find( { classes: "I" } );

There is then no need for slow and complex map-reduce jobs. In general, you should avoid Map/Reduce, as it can't make use of indexes and cannot run concurrently. You also need to understand that in MongoDB an operation works on only one collection. There is no such thing as a join, and trying to emulate one with Map/Reduce is a bad idea. At the very least, you can do it with two queries:

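// find the IDs of all students in class "I"
// (assumes each class document holds a students array of
// { student_id: ... } objects, as in the question):
var ids = [];
db.classes.find( { class: "I" } ).forEach(function(e) {
    e.students.forEach(function(s) { ids.push( s.student_id ); } );
} );
// then, with the result, find all of those students' information:
db.students.find( { student_id: { $in: ids } } );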
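To make the two-step lookup easier to follow outside the shell, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB: the `classes` and `students` arrays are hypothetical in-memory stand-ins for the two collections, the sample names are made up, and `studentIdsInClass` and `studentsByIds` are illustrative helpers, not driver functions. It also assumes that each class document holds a `students` array of `{ student_id: ... }` objects.

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const classes = [
  { class: "I",  students: [{ student_id: "s1" }, { student_id: "s2" }] },
  { class: "II", students: [{ student_id: "s3" }] }
];
const students = [
  { student_id: "s1", name: "Ann" },
  { student_id: "s2", name: "Bob" },
  { student_id: "s3", name: "Cid" }
];

// Step 1: collect the IDs of every student listed in the given class.
function studentIdsInClass(classDocs, className) {
  const ids = [];
  classDocs
    .filter(function (c) { return c.class === className; })
    .forEach(function (c) {
      c.students.forEach(function (s) { ids.push(s.student_id); });
    });
  return ids;
}

// Step 2: keep only the students whose ID is in that list (the $in step).
function studentsByIds(studentDocs, ids) {
  return studentDocs.filter(function (s) {
    return ids.indexOf(s.student_id) !== -1;
  });
}

const ids = studentIdsInClass(classes, "I");
const result = studentsByIds(students, ids);
console.log(result.map(function (s) { return s.name; }));
```

With a real driver the two steps stay the same: the first read produces the list of IDs, and the second read uses that list in a single `$in` query.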

But I would strongly recommend that you redesign your schema and store the classes with each student. As a general hint, in MongoDB you often store the relation between documents on the opposite side from where a relational database would put it.
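As a rough illustration of that redesign, the sketch below turns class documents into per-student documents that carry a `classes` array. It is plain JavaScript over in-memory data, not a migration script: `classDocs` is hypothetical sample data in the shape the question describes, and `invertClassMembership` is an illustrative helper name, not part of MongoDB.

```javascript
// Hypothetical class documents in the shape the question describes.
const classDocs = [
  { class: "I",  students: [{ student_id: "s1" }, { student_id: "s2" }] },
  { class: "II", students: [{ student_id: "s2" }] }
];

// Build one document per student, each carrying the list of its classes.
function invertClassMembership(docs) {
  const byStudent = {};
  docs.forEach(function (c) {
    c.students.forEach(function (s) {
      if (!byStudent[s.student_id]) {
        byStudent[s.student_id] = { student_id: s.student_id, classes: [] };
      }
      byStudent[s.student_id].classes.push(c.class);
    });
  });
  return Object.keys(byStudent).map(function (id) { return byStudent[id]; });
}

const studentDocs = invertClassMembership(classDocs);

// Equivalent of find({ classes: "I" }): match when the array contains "I".
const inClassI = studentDocs.filter(function (s) {
  return s.classes.indexOf("I") !== -1;
});
console.log(inClassI);
```

Once the student documents look like this, the single query on `classes` shown earlier is enough, and no second read is needed.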

Derick
  • You are handling this in the mongo shell, but I am using the Mongo driver, so I have to handle it in my application and do two reads, which is something I want to avoid. – Phalguni Mukherjee Aug 07 '13 at 11:07
  • Like I said, then you will have to redesign your schema. Do note that two reads are going to be a hell of a lot faster than using Map/Reduce. Even though this is on the shell, I'm sure you understand what it does and can rewrite it in Java; your question was all in JavaScript anyway. – Derick Aug 07 '13 at 12:58