3

I have two collections:

USERS:

  • { id:"aaaaaa" , age:19 , sex:"f" }
  • { id:"bbbbbb" , age:30 , sex:"m" }

REVIEWS:

  • { id:777777 , user_id:"aaaaaa" , text:"some review data" }
  • { id:888888 , user_id:"aaaaaa" , text:"some review data" }
  • { id:999999 , user_id:"bbbbbb" , text:"some review data" }

I would like to find all REVIEWS where the user's sex = "f" and age > 18.

(I don't want to nest, because the REVIEWS collection will be huge.)

Mario S
Om Solari

2 Answers

2

You should include the user's data in each review (a.k.a. denormalizing):

{ id:777777 , user: { id:"aaaaaa", age:19 , sex:"f" } , text:"some review data" }
{ id:888888 , user: { id:"aaaaaa", age:19 , sex:"f" } , text:"some other review data" }
{ id:999999 , user: { id:"bbbbbb", age:30 , sex:"m" } , text:"some review data" }
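With the user embedded like this, the original filter becomes a single query on the reviews. The following is a minimal sketch, assuming the sample data from the question: the filter document in `mongo_filter` is the kind of document one would pass to `find()` on the reviews collection (dot notation reaches into the embedded `user`), while the `matches` helper is a hypothetical plain-Python stand-in so the example runs without a database.

```python
# Denormalized reviews, built from the sample data in the question.
reviews = [
    {"id": 777777, "user": {"id": "aaaaaa", "age": 19, "sex": "f"}, "text": "some review data"},
    {"id": 888888, "user": {"id": "aaaaaa", "age": 19, "sex": "f"}, "text": "some review data"},
    {"id": 999999, "user": {"id": "bbbbbb", "age": 30, "sex": "m"}, "text": "some review data"},
]

# Filter document for a find() call on the reviews collection.
# "user.sex" and "user.age" address fields of the embedded sub-document.
mongo_filter = {"user.sex": "f", "user.age": {"$gt": 18}}


def matches(review):
    """Hypothetical in-memory equivalent of mongo_filter, for illustration only."""
    user = review["user"]
    return user["sex"] == "f" and user["age"] > 18


matching_ids = [r["id"] for r in reviews if matches(r)]
print(matching_ids)
```

If this query is frequent, the embedded fields would presumably need an index, which is where the update cost discussed in the comments comes from.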

Here, read this link on MongoDB Data Modeling:

A Note on Denormalization

Relational purists may be feeling uneasy already, as if we were violating some universal law. But let's bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.

What about updates to the username? It's true that such updates will be expensive; happily, in this case, they'll be rare. The read savings achieved in denormalizing will surely outweigh the costs of the occasional update. Alas, this is not a hard and fast rule: ultimately, developers must evaluate their applications for the appropriate level of normalization.

Herman Junge
  • Thank you. If USER data changes, this model would require us to manually update all nested USER data for every REVIEW. If we are indexing on the nested data, the index would have to be rebuilt as well. For a user with 1000s of reviews, any USER data change would require updates to 1000s of records. – Om Solari Oct 28 '12 at 20:44
  • Exactly. That's the downside. If you go down this path, every time you update the user info you'll have to update the other records as well. So you'll have to decide which side of the tradeoff you want to be on: lots of updates or lots of queries. It all depends on which you do more. – Herman Junge Oct 28 '12 at 20:49
  • 1
    Which part of user data do you expect will change? Sex? Unlikely. Age? Do you actually want to store their current age or their age when they created the review? – Asya Kamsky Oct 29 '12 at 05:17
1

MongoDB does not support querying another collection in a single query. Unless you denormalize the REVIEWS collection with your search attributes, you will need more than one query. See this post.
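Without denormalizing, the usual workaround is two queries: first fetch the ids of the matching users, then fetch the reviews whose `user_id` is in that set. The sketch below is an illustration under stated assumptions, not a driver example: it uses in-memory lists in place of the two collections so it runs standalone, and the filter documents shown in the comments are what each step would look like as a `find()` filter.

```python
# In-memory stand-ins for the two collections from the question.
users = [
    {"id": "aaaaaa", "age": 19, "sex": "f"},
    {"id": "bbbbbb", "age": 30, "sex": "m"},
]
reviews = [
    {"id": 777777, "user_id": "aaaaaa", "text": "some review data"},
    {"id": 888888, "user_id": "aaaaaa", "text": "some review data"},
    {"id": 999999, "user_id": "bbbbbb", "text": "some review data"},
]

# Query 1 (users): filter would be {"sex": "f", "age": {"$gt": 18}},
# projecting only the id field.
user_ids = {u["id"] for u in users if u["sex"] == "f" and u["age"] > 18}

# Query 2 (reviews): filter would be {"user_id": {"$in": [...ids from query 1...]}}.
matching_reviews = [r for r in reviews if r["user_id"] in user_ids]

for review in matching_reviews:
    print(review["id"], review["user_id"])
```

The tradeoff is that the id list from the first query can grow large when many users match, so this approach fits best when the user filter is selective.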

Community
  • What about DBREF? That doesn't count as querying another collection? – Om Solari Oct 28 '12 at 19:00
  • You are still limited to querying off of $id or _id. For other fields, you need to perform another query. Further, DBRefs are [resolved client-side](http://docs.mongodb.org/manual/applications/database-references/), and your driver may or may not support automatic hydration. This means you end up doing another query anyway. –  Oct 28 '12 at 19:18