I have a dataset like:

cid_int   item_id   score
  1         678      0.5
  2         787      0.6
  3         908      0.1
  .          .        .
  .          .        .

I'm fitting an ALS model on this PySpark DataFrame to get recommendations via collaborative filtering.

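from pyspark.ml.recommendation import ALS
# rank=5 sets the dimension of the learned user and item factor vectors
als = ALS(userCol="cid_int", itemCol="item_id", ratingCol="score", rank=5, maxIter=10, seed=0)
model = als.fit(X_train)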

What does model.userFactors return? Does it return item embeddings, i.e. for m items will I get m embeddings?

If so, can I run KNN on these embeddings to find the items closest to a given item?

Chris_007

1 Answer

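model.userFactors returns the user embeddings, one row per user; the item embeddings are in model.itemFactors, one row per item seen during training. In your case (rank=5), each embedding is a vector of dimension 5.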

Yes, you can use these embeddings in a KNN model; to find the items closest to a given item, use the item embeddings. If the KNN model performs poorly, try increasing the rank value, which increases the dimension of the vectors.
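
As a rough illustration of the KNN idea, here is a minimal pure-Python sketch of a brute-force nearest-neighbour lookup over item vectors using cosine similarity. The helper names (cosine_similarity, nearest_items) and the toy vectors are made up for this example, and cosine similarity is only one possible choice of distance. With a fitted model, the dictionary could presumably be built from model.itemFactors (columns id and features), assuming the factor table is small enough to collect to the driver.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_items(item_vectors, query_id, k=3):
    """Return the k item ids most similar to query_id, most similar first."""
    query = item_vectors[query_id]
    scored = [
        (cosine_similarity(query, vec), item_id)
        for item_id, vec in item_vectors.items()
        if item_id != query_id  # do not return the query item itself
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Hypothetical toy embeddings of dimension 5 (rank=5); not real model output.
# With a fitted ALS model, one option (assumption: the table fits in memory) is
#   {row["id"]: list(row["features"]) for row in model.itemFactors.collect()}
toy_item_vectors = {
    678: [1.0, 0.0, 0.0, 0.0, 0.0],
    787: [0.9, 0.1, 0.0, 0.0, 0.0],
    908: [0.0, 1.0, 0.0, 0.0, 0.0],
}

neighbours = nearest_items(toy_item_vectors, 678, k=2)
print(neighbours)
```

A brute-force scan like this is fine for a small catalogue; for many items, an approximate nearest-neighbour index is the usual alternative.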

Danylo Baibak
  • Hi, do you have any idea how I can validate these KNN recommendations? Is there a metric or concept I can use to evaluate my CF and KNN models? – Chris_007 Aug 31 '22 at 19:20
  • For recommender systems, people often use metrics such as "mean average precision at k", "mean average recall at k", and "item coverage". – Danylo Baibak Sep 01 '22 at 06:26
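
The metrics named in the comment above can be sketched in plain Python as follows. This is an illustrative sketch, not a reference implementation: the function names and toy data are invented here, and the normalisation used for average precision at k (dividing by min(number of relevant items, k)) is one common convention among several. Averaging the per-user values over all users gives the "mean" versions.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision_at_k(recommended, relevant, k):
    """Average precision@k for one user (one common normalisation)."""
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    for position, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / position  # precision at this cut-off
    return score / min(len(relevant), k)


def item_coverage(all_recommendations, catalog):
    """Fraction of catalogue items that appear in at least one user's list."""
    recommended_items = set()
    for recs in all_recommendations:
        recommended_items.update(recs)
    return len(recommended_items & set(catalog)) / len(catalog)


# Hypothetical toy data: one user's ranked list and held-out relevant items.
recommended = [787, 908, 678]
relevant = {787, 678}

precision, recall = precision_recall_at_k(recommended, relevant, k=3)
ap = average_precision_at_k(recommended, relevant, k=3)
coverage = item_coverage([[787, 908], [787]], catalog=[678, 787, 908])
print(precision, recall, ap, coverage)
```

In practice the relevant set would come from a held-out split of the interaction data, so that the recommendations are scored against items the model did not see during training.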