9

What I need:

Suppose you're using MongoDB and you have a collection called users, and each user has a "following" array with user _ids of the people he's following. Then you have another collection statuses, with each status containing the _id of its author. How do you display to a certain user all the statuses added by people he's following?

What I tried:

I put the _ids of all the users that the current user is following into an array (I'm using PHP), then used that array with $in to find all the statuses by those users.

The question:

Is this the best solution?

ySgPjx

3 Answers

3

I can't see any other way either. I implemented something like this before and didn't have a problem.

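In your case it should be something like this: you pass that user's $follower_ids array (the _ids of the users he is following) as an argument to your function:

// $mongo is assumed to be an already-connected client from the PHP Mongo driver
// Match every status whose owner is one of the followed users
$query  = array("status_owner_id" => array('$in' => $follower_ids));
$cursor = $mongo->yourdb->statuses->find($query);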

And if you index statuses on status_owner_id (assuming you have enough RAM to keep the index in memory), you'd get the results really fast.

Hope it helps, Sinan.

Sinan
1

Yeah, I do the exact same thing. See what Dwight Merriman suggested on his blog:

http://dmerr.tumblr.com/post/463694595/just-for-fun-a-single-server-twitter-design

sdot257
0

What you tried is what everybody thinks of first, but it isn't easy to scale. You can always add more servers or use sharding, but if you have millions of users, and some of them follow a lot of people, this query becomes really expensive to run.

Another solution is to do the aggregation at write time, when someone posts a status. Facebook uses this idea, and it might be easier to scale: if someone is following 25,000 people, he will still see his list of statuses pretty quickly, and your server won't have to "fight" to retrieve the data.

You will have a users collection, and each user will have a statuses array. Let's say you have user1 and user2, and user1 follows user2. When user2 posts a status, it is saved in user1's array of statuses AND in user2's array of statuses. You will use more storage, which with MongoDB means more memory. At Facebook they use Hadoop with HBase for the main storage, and then they have huge arrays of servers with lots of memory.
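
To make the write-time copy concrete, here is a rough sketch in plain Python, with a dict standing in for the users collection (no MongoDB driver involved). The field names and the post_status / followers_of helpers are made up for illustration, and it assumes statuses arrive in chronological order.

```python
# Plain-Python stand-in for the "users" collection; no MongoDB driver involved.
# The field names ("following", "statuses") are assumptions for illustration.
users = {
    "user1": {"following": ["user2"], "statuses": []},
    "user2": {"following": [], "statuses": []},
}


def followers_of(author_id):
    """Return the ids of every user whose "following" array contains author_id."""
    return [uid for uid, doc in users.items() if author_id in doc["following"]]


def post_status(author_id, text, time):
    """Copy a new status into the author's array and into each follower's array."""
    entry = {"from": author_id, "status": text, "time": time}
    for uid in [author_id] + followers_of(author_id):
        # Insert at the front so each array stays newest-first
        # (assumes statuses are posted in chronological order).
        users[uid]["statuses"].insert(0, dict(entry))


post_status("user2", "hello", time=1)
post_status("user1", "hi back", time=2)
post_status("user2", "second post", time=3)

# user1 follows user2, so user1's array holds all three statuses.
print([s["status"] for s in users["user1"]["statuses"]])
# user2 follows nobody, so user2's array holds only user2's own statuses.
print([s["status"] for s in users["user2"]["statuses"]])
```

Reading a timeline is then just reading one array. The cost has moved to post_status, which does one write per follower.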

One drawback is that if you delete a status, you have to delete it everywhere. The major advantage of this solution is that each user has an array of statuses that is already in order. In the previous solution, if you follow 3 users, you need to grab all their feeds, then sort them, then render them.
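
For comparison, this is roughly the work the read-time approach has to do on every page view. It is only a sketch in plain Python: statuses_by_author and timeline are hypothetical names, and it assumes each author's feed is already sorted newest first.

```python
import heapq
from itertools import islice

# Hypothetical per-author feeds, each already sorted newest first.
statuses_by_author = {
    "a": [{"time": 9, "status": "a2"}, {"time": 2, "status": "a1"}],
    "b": [{"time": 7, "status": "b1"}],
    "c": [{"time": 8, "status": "c2"}, {"time": 1, "status": "c1"}],
}


def timeline(following, limit=10):
    """Grab the feed of every followed user, merge them by time, keep `limit`."""
    feeds = [statuses_by_author.get(uid, []) for uid in following]
    # heapq.merge needs inputs that are already sorted in the same direction.
    merged = heapq.merge(*feeds, key=lambda s: s["time"], reverse=True)
    return [s["status"] for s in islice(merged, limit)]


feed = timeline(["a", "b", "c"], limit=4)
print(feed)
```

With 3 followed users this is cheap. With thousands of followed users, the number of feeds to fetch and merge grows with every person followed.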

[Edit] As Shekhar points out in the comments, MongoDB has a document size limit. Instead, you need to create a status collection and save each status twice, once for user2 and once for user1, with a fromId, toId, status, and time.
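
A rough sketch of that layout, again in plain Python with a list standing in for the status collection. The fromId, toId, status and time fields come from the edit above. The helper names, and identifying a status by (fromId, time) when deleting, are assumptions made for the example.

```python
# Stand-in for the "status" collection: one document per (status, recipient).
statuses = []

# Hypothetical follow graph: user1 follows user2.
following = {"user1": ["user2"], "user2": []}


def post_status(from_id, text, time):
    """Save one copy for the author and one copy for each follower."""
    recipients = [from_id] + [u for u, f in following.items() if from_id in f]
    for to_id in recipients:
        statuses.append(
            {"fromId": from_id, "toId": to_id, "status": text, "time": time}
        )


def timeline(user_id):
    """All copies addressed to user_id, newest first."""
    mine = [s for s in statuses if s["toId"] == user_id]
    return sorted(mine, key=lambda s: s["time"], reverse=True)


def delete_status(from_id, time):
    """Remove every copy of a status (identified here by author and time)."""
    statuses[:] = [
        s for s in statuses
        if not (s["fromId"] == from_id and s["time"] == time)
    ]


post_status("user2", "hello", time=1)   # stored twice: for user2 and user1
post_status("user1", "hi back", time=2)  # stored once: user1 has no followers
print([s["status"] for s in timeline("user1")])
```

In MongoDB itself the timeline would be a find on toId sorted by time, so an index covering those two fields is what you would probably want. No single document grows with the number of statuses, which sidesteps the size limit.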

zzarbi
  • Given that a document in MongoDB can't be larger than 16 MB, wouldn't you hit the limit and eventually have to create a new collection? – Shekhar Dec 09 '11 at 18:35
  • Yes indeed. That's because my example came from Facebook, which uses HBase. For MongoDB you can create a status collection: when user2 posts a status you save it twice, once for user2 and a second time for user1. Each status document will contain a fromId, a toId, a time and the status itself... – zzarbi Dec 09 '11 at 23:11