I have a case where one collection uses a DBRef to another collection. The first collection is Books; the second is Users (who read those books). Users can have avatars and various other information, which it is reasonable to keep in a separate collection. But now I need to shard the Books collection. If I shard it across 2 nodes, how will the Users collection be sharded? I would like to keep the users related to particular books on the same node. Is that possible? Thanks!

  • Could you maybe give some information about your schema, your usage and your thoughts so far? As it stands this is broad and unclear, so you are unlikely to get a meaningful answer. You need to edit the question to add that kind of detail. – Neil Lunn Mar 08 '14 at 08:17
  • What exactly is not clear? There are 2 collections, which are related to each other. This is a general question on how I should shard 1 collection in order to have all related documents from the other collection on the same node. – Nerses Kanayan Mar 08 '14 at 14:50
  • You do not explain what you intend to use for a shard key or are thinking of, nor do you explain why you wish for certain items to be kept together. How can anyone answer? Look at the answers you have received. No need to be aggressive when someone is trying to help. You have not explained yourself and the proof is in the results. – Neil Lunn Mar 08 '14 at 14:58
  • At the moment this is not possible outside of tag-aware sharding, and you might find that very difficult to maintain. I wouldn't advise it unless you are solely a DBA, since with an ever-expanding network like that you will literally spend most of your time keeping it together. What I would do instead is shard the books on user_id and shard the user collection on _id; that way you only need to query two shards at most. – Sammaye Mar 08 '14 at 15:37
  • Neil Lunn, sorry if I seemed aggressive. Due to my lack of Mongo knowledge, it was not clear what other details I should provide. The reason I want related documents kept on the same instance is so that I do not have to do a lookup in the other collection over the network. If I knew what the shard key should be, why would I ask? :) Sammaye's answer seems to make perfect sense. Thanks! – Nerses Kanayan Mar 08 '14 at 17:15

1 Answer

At the moment this is not possible outside of tag-aware sharding ( http://docs.mongodb.org/manual/core/tag-aware-sharding/ ). Kristina (when she was still with 10gen) wrote a good article on how to control the distribution of your data, which can easily be adapted to group multiple collections: http://www.kchodorow.com/blog/2012/07/25/controlling-collection-distribution/

However, you might find that very difficult to maintain, so I wouldn't advise it unless you are solely a DBA: with an ever-expanding network like that, you will literally spend most of your time keeping it together.

What I would do instead is shard the books on user_id and shard the user collection on hashed _id; that way you only need to query two shards at most.
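As a rough sketch of that suggestion, the admin commands could be built like this in Python. The database name `library`, the collection names `books` and `users`, and the presence of a `user_id` field on each book document are assumptions for illustration, not details from the question. `apply_shard_commands` assumes it is handed a `pymongo.MongoClient` connected to a mongos; it is not exercised here.

```python
def build_shard_commands(db_name="library"):
    """Return the admin commands, in order, as plain dicts.

    Assumed names: database `library`, collections `books` and `users`,
    and a `user_id` field on every book document.
    """
    return [
        # Sharding must be enabled on the database first.
        {"enableSharding": db_name},
        # Books are range-sharded on the user they reference.
        {"shardCollection": f"{db_name}.books", "key": {"user_id": 1}},
        # Users are sharded on a hashed _id for an even spread.
        {"shardCollection": f"{db_name}.users", "key": {"_id": "hashed"}},
    ]


def apply_shard_commands(client, commands):
    """Send each command to the admin database.

    `client` is assumed to be a pymongo.MongoClient connected to a mongos.
    """
    for cmd in commands:
        client.admin.command(cmd)


if __name__ == "__main__":
    for cmd in build_shard_commands():
        print(cmd)
```

With these keys, a query for one user's books that includes user_id can be routed to a single shard, and fetching that user by _id touches one more shard, which is where the "two shards at most" comes from.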

Sammaye