EDIT: I know it's been over a year, but I finally got something new to this problem. To see an update for this look at this question: Rails 3 user matching-algorithm to SQL Query (COMPLICATED)
I'm working on a site where users are matched based on answered questions.
The match percentage is calculated each time a user, for example, visits another users profile page. So the matching percentage is not stored in the database and is recalculated all the time.
Now I want to build in a search where users can search for their best match.
The question I have is, what is the most efficient way to do this?
What if I have 50k users and I have to list them ordered by match percentages. Do I have to calculate each matching percentage between one and the other 50k users and then create a list out of that? Sounds kind of inefficient to me. Wouldn't that slow down the app drastically?
I hope someone can help me with this, because this gives me kind of a headache.
EDIT: To clear things up a bit, here is my database model for user, questions, answers, user_answers and accepted_answers:
Tables:
Users(:id, :username, etc.)
Questions(:id, :text)
Answers(:id, :question_id, :text)
UserAnswers(:id, :user_id, :question_id, :answer_id, :importance)
AcceptedAnswers(:id, :user_answer_id, :answer_id)
Questions <-> Answers: one-to-many
Questions <-> UserAnswers: one-to-many
Users <-> UserAnswers: one-to-many
UserAnswers <-> AcceptableAnswers: one-to-many
So there is a list of Questions(with possible answers to this question) and Users give their "UserAnswers" to those questions, assign how important that question is to them and what answers they accept from other users.
Then if you take User1 and User2, you look for common answered questions, so UserAnswers where the question_id is the same. They have 10 questions in common. User1 gave the importance value 10 to the first five questions and the importance value 20 to the other five. User 2 gave acceptable answers to two 20 value and three 10 value questions. A total of 70 points. The highest reachable pointscore is of course 20x5 + 10x5... So User2 reached 70/150 * 100 = 46,66% ... The same thing is done the other way around for how much User1 reached of User2's assigned points to those questions. Those 2 percentages are then combined through the geometric mean: sqrt of percentage1 * percentage2 ... this gives the final match percentage