I'm using MongoDB to store user profiles, and now I want to use GridFS to store a picture for each profile.

The two ways I'm comparing linking the two documents are:

A) Store a reference to the file ID in the user's image field:

User: 
{
  "_id": ObjectId('[user_id here]'),
  "username": 'myusername',
  "image": ObjectId('[file_id here]')
}

B) Store a reference to the user in the file's metadata:

File metadata: 
{
  "user_id": ObjectId('[user_id here]')
}

I know in a lot of ways it's up to me and dependent on the particulars of the app (it'll be mobile, if that helps), but I'm just wondering if there's any universal benefit to doing it one way or the other?

Matt Stauffer

1 Answer

The answer here really depends on your application's usage pattern. My assumption (feel free to correct me) is that the most likely pattern is something like this:

Look Up User --> Find User --> Display Profile(Fetch Picture)

With this generalized use case and method A, you find the user document (to display the profile), which contains the image object ID, and you then fetch the file using that ID. That is two basic operations and you are done.

Note: I am treating the actual fetching of the file from GridFS as a single logical operation. In reality there are multiple operations involved, but most of the drivers/APIs hide this anyway.

With method B, you are going to have to find the user document, then do another query to find the relevant user_id in the file metadata collection, then go fetch the file. By my count, that is three operations (an extra find you do not have with method A).

Does that make sense?

Of course, if my assumption is incorrect and your application is (for example) image driven, then your query pattern may come up with a different answer.
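
To make the comparison concrete, here is a minimal sketch of both lookups written against PyMongo-style objects. It is an illustration under assumptions, not a definitive implementation: the `users` collection name, the default `fs` bucket, and the helper function names are all hypothetical, and method B assumes the file was stored with `user_id` inside the file document's `metadata` field so it can be queried directly.

```python
USERS = "users"  # assumed name of the profile collection


def fetch_picture_method_a(db, fs, user_id):
    """Method A: the user document holds the GridFS file ID in `image`."""
    user = db[USERS].find_one({"_id": user_id})   # operation 1: find the user
    if user is None or user.get("image") is None:
        return None
    return fs.get(user["image"]).read()           # operation 2: fetch the file by _id


def fetch_picture_method_b(db, fs, user_id):
    """Method B: the file document holds the user ID in `metadata.user_id`."""
    user = db[USERS].find_one({"_id": user_id})   # find the user
    if user is None:
        return None
    # Look the file up by a non-_id field; without an index on that
    # field this query has to scan the files collection.
    grid_out = fs.find_one({"metadata.user_id": user["_id"]})
    return grid_out.read() if grid_out is not None else None


def ensure_method_b_index(db):
    """Index the field that method B queries (assumes the default `fs` bucket)."""
    return db["fs.files"].create_index("metadata.user_id")
```

With a real deployment, `db` would be a PyMongo database and `fs` a `gridfs.GridFS(db)` instance. Method A looks the file up by its `_id`, which GridFS already indexes, while method B depends on the extra index created by `ensure_method_b_index`.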

Adam Comerford
  • Thanks--could you clarify? I assumed with Method B I could just 1) Fetch the user, and 2) Fetch the file which has the metadata with the userid of the previously retrieved user. I don't have experience with file metadata, so the particulars of this will be wrong, but I was imagining something like $grid->findOne(array('metadata'=>array('user_id'=>ObjectId('[user_id here]')))) (using PHP driver) .. is that not possible? – Matt Stauffer Jan 25 '12 at 17:37
  • When you mentioned the metadata, I thought you were referring to a separate collection rather than the files collection that GridFS uses. That won't contain the user_id by default, so I defaulted to the above in my head. Of course, you can add user_id to the files collection, and that would allow you to query it directly. In that case you have to make sure to index the files collection correctly to optimize it for your query, rather than relying on the "standard" GridFS setup: more for you to remember, but just as valid. I'd still go for method A as a result, but it's up to you :) – Adam Comerford Jan 25 '12 at 21:19
  • Thanks for your help thinking through it! – Matt Stauffer Jan 26 '12 at 04:35
  • Matt/Adam, sorry for bumping this thread. I have the same question, but I also wondered how I can retrieve a list of ALL the users and show their images at the same time in, let's say, a for-each loop. Should I simply follow option A (of the main post) and then put an input stream inside the user object for each user? – Moody Apr 15 '15 at 20:44