
Let's say we have a user collection and a post collection. In the post collection, the vote field stores user names as keys.

db.user.insert({name:'a', age:12});
db.user.insert({name:'b', age:12});
db.user.insert({name:'c', age:22});
db.user.insert({name:'d', age:22});

db.post.insert({Title:'Title1', vote:['a']});
db.post.insert({Title:'Title2', vote:['a','b']});
db.post.insert({Title:'Title3', vote:['a','b','c']});
db.post.insert({Title:'Title4', vote:['a','b','c','d']});

We would like to group by post.Title and count the votes for each user age.

> {_id:'Title1', value:{ ages:[{age:12, Count:1},{age:22, Count:0}]} }
> {_id:'Title2', value:{ ages:[{age:12, Count:2},{age:22, Count:0}]} }
> {_id:'Title3', value:{ ages:[{age:12, Count:2},{age:22, Count:1}]} }
> {_id:'Title4', value:{ ages:[{age:12, Count:2},{age:22, Count:2}]} }

I have searched and can't find a way to access two collections in a MongoDB map-reduce. Could this be achieved in a re-reduce?

I know it would be much simpler to embed the user document in the post, but that is not a nice way to do it because the real user document has many properties. If we embed a simplified version of the user document, it limits the dimensions available for analysis.

{Title:'Title1', vote:[{name:'a', age:12}]}
Has QUIT--Anony-Mousse
Kuroro
  • It is not possible to perform a map-reduce on multiple collections. Please also explain your rationale when you say "it is not nice way to do" with respect to embedding a document. Provide your design considerations if possible. – Samyak Bhuta Nov 01 '11 at 12:19

2 Answers


MongoDB does not have a multi-collection Map / Reduce. MongoDB does not have any JOIN syntax and may not be very good for ad-hoc joins. You will need to denormalize this data in some way.

You have a few options:

Option #1: Embed the age with the vote.

{Title:'Title1', vote:[{name:'a', age:12}]}
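With the age embedded, a single-collection map-reduce is enough. Below is a minimal sketch of what the map and reduce functions might look like, run against in-memory sample data so it can be tried in plain JavaScript. The names mapPost, reduceAges and runMapReduce are hypothetical; runMapReduce is only a local stand-in for the server-side run, and in a real mapReduce call the map function would read the document from this and call the global emit instead. Note that ages with no votes are simply absent here rather than reported with Count:0.

```javascript
// Sample posts with the voter's age embedded in each vote (Option #1).
const posts = [
  { Title: 'Title1', vote: [{ name: 'a', age: 12 }] },
  { Title: 'Title3', vote: [{ name: 'a', age: 12 }, { name: 'b', age: 12 }, { name: 'c', age: 22 }] },
];

// Map: emit one partial result per vote, in the same shape that reduce returns.
function mapPost(post, emit) {
  post.vote.forEach(function (v) {
    emit(post.Title, { ages: [{ age: v.age, Count: 1 }] });
  });
}

// Reduce: merge partial results by summing Count per age.
// Input and output share one shape, so it is safe to re-reduce.
function reduceAges(key, values) {
  const totals = {};
  values.forEach(function (value) {
    value.ages.forEach(function (entry) {
      totals[entry.age] = (totals[entry.age] || 0) + entry.Count;
    });
  });
  const ages = Object.keys(totals).map(function (age) {
    return { age: Number(age), Count: totals[age] };
  });
  return { ages: ages };
}

// Hypothetical local harness standing in for the server-side mapReduce run.
function runMapReduce(docs, map, reduce) {
  const groups = {};
  docs.forEach(function (doc) {
    map(doc, function (key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });
  return Object.keys(groups).map(function (key) {
    const values = groups[key];
    // Like the server, skip reduce when a key has only one emitted value.
    return { _id: key, value: values.length > 1 ? reduce(key, values) : values[0] };
  });
}

const embeddedResult = runMapReduce(posts, mapPost, reduceAges);
console.log(JSON.stringify(embeddedResult));
```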

Option #2: Keep a counter of the ages

{Title:'Title1', vote:['a', 'b'], age: { "12" : 2 }}
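The counter has to be kept in step with the vote list every time someone votes. As a rough sketch, the update document for one vote could be built like this; buildVoteUpdate is a hypothetical helper, and it assumes the caller has already looked up the voter's age and checked that the user has not voted on this post before.

```javascript
// Hypothetical helper: builds the update document for one new vote.
function buildVoteUpdate(user) {
  const inc = {};
  inc['age.' + user.age] = 1; // e.g. "age.12"; dot notation targets the nested counter
  return {
    $push: { vote: user.name }, // record who voted
    $inc: inc,                  // bump the counter for that voter's age
  };
}

const voteUpdate = buildVoteUpdate({ name: 'b', age: 12 });
console.log(JSON.stringify(voteUpdate));

// In the mongo shell this could then be applied with something like:
//   db.post.update({ Title: 'Title1' }, voteUpdate);
```

Both operators run in one update, so the list and the counter change together for that post.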

Option #3: Do a "manual" join

Your last option is to write a script that loops over both collections and merges the data correctly.

So you would loop over post and output a collection with the title and the list of votes. Then you would loop through the new collection and update the ages by looking up each user.
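The manual join can be sketched in plain JavaScript as follows. The arrays stand in for the two collections from the question; in a real script they would presumably come from db.user.find() and db.post.find(). The variable names are illustrative only. This version does the lookup in memory in one pass instead of writing an intermediate collection.

```javascript
// In-memory stand-ins for the user and post collections from the question.
const users = [
  { name: 'a', age: 12 },
  { name: 'b', age: 12 },
  { name: 'c', age: 22 },
  { name: 'd', age: 22 },
];
const postDocs = [
  { Title: 'Title1', vote: ['a'] },
  { Title: 'Title2', vote: ['a', 'b'] },
  { Title: 'Title3', vote: ['a', 'b', 'c'] },
  { Title: 'Title4', vote: ['a', 'b', 'c', 'd'] },
];

// Step 1: index users by name so each vote lookup is cheap.
const ageByName = {};
users.forEach(function (u) {
  ageByName[u.name] = u.age;
});

// Collect every distinct age so that ages without votes are reported as 0.
const allAges = Array.from(new Set(users.map(function (u) { return u.age; })))
  .sort(function (x, y) { return x - y; });

// Step 2: walk each post and count its votes per voter age.
const joined = postDocs.map(function (post) {
  const counts = {};
  allAges.forEach(function (age) { counts[age] = 0; });
  post.vote.forEach(function (name) {
    const age = ageByName[name];
    if (age !== undefined) {
      counts[age] += 1; // votes from unknown users are skipped
    }
  });
  return {
    _id: post.Title,
    value: { ages: allAges.map(function (age) { return { age: age, Count: counts[age] }; }) },
  };
});

joined.forEach(function (row) {
  console.log(JSON.stringify(row));
});
```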

My suggestion

Go with #1 or #2.

Gates VP

Instead of

{name:'a', age:12}

It is easier to add a new field to the user document and maintain it on each vote update. You can then still use map-reduce to analyze your data.

{name:'a', age:12, voteTitle:["Title1","Title2","Title3","Title4"]}
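With the titles stored on the user, the count per title and age only needs the user collection. A small sketch over in-memory sample documents (votingUsers and countsByTitle are illustrative names, and the sample data is made up for the example):

```javascript
// Sample user documents carrying the titles each user voted for.
const votingUsers = [
  { name: 'a', age: 12, voteTitle: ['Title1', 'Title2'] },
  { name: 'c', age: 22, voteTitle: ['Title2'] },
];

// Count votes per title, broken down by the voter's age.
const countsByTitle = {};
votingUsers.forEach(function (user) {
  user.voteTitle.forEach(function (title) {
    const perAge = countsByTitle[title] = countsByTitle[title] || {};
    perAge[user.age] = (perAge[user.age] || 0) + 1;
  });
});

console.log(JSON.stringify(countsByTitle));
```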
jcollum
Kuroro