
I have two sets of user sessions. Each set consists of two columns:
- ids of items viewed in an online shop
- ids of items bought in an online shop

One set has to be used for training (building the top-products rating), and the second set has to be used for testing.

All the ids of bought items are different.

I need to do the following:
1. On the test set, calculate the frequencies of viewed and bought ids (one id can occur several times in the viewed items).
2. Implement two recommendation algorithms:
- sort viewed ids by popularity (frequency of occurrence in viewed items)
- sort viewed ids by purchases (frequency of occurrence in bought items)
3. Using these algorithms, calculate AverageRecall@1, AveragePrecision@1, AverageRecall@5, AveragePrecision@5.
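
A minimal sketch of the frequency counting in step 1, assuming each session is available as a pair of Python lists `(viewed_ids, bought_ids)`. The function name `count_frequencies` and the toy data are hypothetical, and `collections.Counter` is used here in place of `OrderedDict`:

```python
from collections import Counter

def count_frequencies(sessions):
    """Count how often each id occurs in viewed and in bought items.

    sessions: iterable of (viewed_ids, bought_ids) pairs (assumed layout).
    """
    viewed_freq = Counter()
    bought_freq = Counter()
    for viewed, bought in sessions:
        viewed_freq.update(viewed)  # repeated views of one id each count
        bought_freq.update(bought)
    return viewed_freq, bought_freq

# Hypothetical toy data, not taken from the real data set.
sessions = [
    ([1, 2, 2, 3], [2]),
    ([2, 3], []),
    ([4, 1], [1]),
]
viewed_freq, bought_freq = count_frequencies(sessions)
```

A `Counter` returns 0 for ids it has never seen, so no special case is needed for unknown items.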

Important:
- Sessions in which the user did not buy anything are excluded from the quality assessment.
- If an item is not found in the training set, its popularity is 0.
- The recommended items must be distinct, and their number should be no more than the number of distinct items the user viewed.
- The number of recommendations is never greater than the minimum of two numbers: the number of viewed items and the k in recall@k / precision@k.
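
The constraints above could be sketched roughly as follows. The function name `recommend` and the sample frequencies are hypothetical, and breaking ties by the order in which items were viewed is an assumption, since the task does not state a tie-breaking rule:

```python
from collections import Counter

def recommend(viewed, freq, k):
    """Return at most k distinct viewed ids, most frequent first."""
    # dict.fromkeys drops duplicates and keeps the first-seen order.
    unique_viewed = list(dict.fromkeys(viewed))
    # sorted() is stable, so ties keep viewing order (an assumption);
    # ids missing from the training frequencies get popularity 0.
    ranked = sorted(unique_viewed, key=lambda item: -freq.get(item, 0))
    # Slicing caps the result at min(k, number of distinct viewed items).
    return ranked[:k]

# Hypothetical training frequencies.
train_freq = Counter({10: 5, 20: 3, 30: 3})
recs_at_5 = recommend([30, 20, 30, 99, 10], train_freq, k=5)
recs_at_1 = recommend([30, 20, 30, 99, 10], train_freq, k=1)
```

Passing the viewed-item frequencies gives the popularity ranking, and passing the bought-item frequencies gives the purchase ranking, so one function covers both algorithms.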

For the first task (calculating the frequencies) I use OrderedDict. For the second task I use these functions:

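    import numpy as np

    def apk(actual, predicted, k=1):
        # actual: ids of bought items; predicted: ranked recommended ids.
        if len(predicted) > k:
            predicted = predicted[:k]

        score = 0.0
        num_hits = 0.0

        for i, p in enumerate(predicted):
            # Count a hit only the first time a relevant id appears.
            if p in actual and p not in predicted[:i]:
                num_hits += 1.0
                score += num_hits / (i + 1.0)

        if not actual:
            return 0.0

        return score / min(len(actual), k)

    def mapk(actual, predicted, k=10):
        # Mean of the per-session average precision at k.
        return np.mean([apk(a, p, k) for a, p in zip(actual, predicted)])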

But I don't know how to do the third task (average recall, etc. for every k) or what to do with the OrderedDict.
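
One possible shape for the third task, as a sketch under assumptions rather than a definitive solution: precision@k is taken as hits divided by k and recall@k as hits divided by the number of bought items, and sessions without purchases are skipped. The names `precision_recall_at_k` and `average_metrics`, the stand-in recommender, and the toy sessions are all hypothetical:

```python
def precision_recall_at_k(recommended, bought, k):
    """Precision@k and recall@k for a single session."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(bought))
    precision = hits / k              # assumption: divide by k, not len(top_k)
    recall = hits / len(set(bought))  # caller guarantees bought is non-empty
    return precision, recall

def average_metrics(sessions, recommend_fn, k):
    """Average precision@k and recall@k over sessions with a purchase."""
    precisions, recalls = [], []
    for viewed, bought in sessions:
        if not bought:  # sessions without purchases are excluded
            continue
        p, r = precision_recall_at_k(recommend_fn(viewed, k), bought, k)
        precisions.append(p)
        recalls.append(r)
    n = len(precisions)
    if n == 0:
        return 0.0, 0.0
    return sum(precisions) / n, sum(recalls) / n

# Stand-in recommender: distinct viewed ids in viewing order.
def first_viewed(viewed, k):
    return list(dict.fromkeys(viewed))[:k]

sessions = [([1, 2, 3], [2]), ([2, 3], []), ([4, 1], [1, 4])]
avg_p1, avg_r1 = average_metrics(sessions, first_viewed, k=1)
avg_p5, avg_r5 = average_metrics(sessions, first_viewed, k=5)
```

Calling `average_metrics` once per ranking and per k (1 and 5) would give the four requested numbers for each algorithm. If precision should instead be divided by the number of items actually recommended, only the `precision` line changes.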

Alex Savin
  • `@` has a meaning in Python that's not what you're using it for. Could you please edit your post to clarify? – Patrick Haugh Oct 26 '16 at 15:33
  • Why is that meaning not what I need? I found the following formulas: Recall = (relevant items recommended in top-k) / (relevant items); Precision = (relevant items recommended in top-k) / (k items recommended) – Alex Savin Oct 26 '16 at 15:54
  • `@` is the infix matrix multiplication operator: https://www.python.org/dev/peps/pep-0465/ – Patrick Haugh Oct 26 '16 at 15:55
