I've got an application that displays data to the user based on a relevancy score. There are 5 to 7 different types of information that I can display (e.g. Users Tags, Friends Tags, Recommended Tags, Popular Tags, etc.) Each information type would be a separate sql query.
Then I have an algorithm that ranks each type by how relevant it is. The algorithm is based on several factors including how long its been since an action was taken on a particular type, how important one information type is to another, how often one type has been shown, etc.
Once they are ranked, I show them to the users in a feed, similar to Facebook.
My question is a simple one. I need the data before I can run it through the ranking algorithm, so whats the most efficient way to pull only the data I need from the database.
Currently I pull the top 5 instances of each information type, and then rank those. Each piece of data gets a relevancy score, and if I don't have enough results that reach a certain relevancy threshold, I go back to the database for the next 5 of each.
The problem with this approach is that I risk pulling too many of one story type that I never use, and I have to keep going back to the database if I don't get what I need the first time.
I have thought about a massive sql query that incorporates all info types & algorithm, which could work, but that would really be a huge query, and them I'm having mysql do so much processing, and I'm of the general mind set that Mysql should do the data retrieval and my programming language (php) should do the processing stuff.
There has to be a better way! I'm sure there is a scholarly article somewhere, but I haven't been able to find it.
Thanks Stack Overflow