
I have a DataFrame of users' "taste" vectors:

+------+--------------------+
|userId|      scaledFeatures|
+------+--------------------+
|    18|[0.0,0.0,0.0,0.0,...|
|    65|[0.0,0.0023910733...|
|    96|[0.0,0.0,0.005268...|
|   121|[0.0,0.0021253985...|
|   129|[0.0,0.0029224229...|
+------+--------------------+

And a DataFrame of movies' content vectors:

+-------+--------------------+
|movieId|      scaledFeatures|
+-------+--------------------+
|      1|[1.0,0.0,0.0,0.0,...|
|      2|[1.0,0.0,0.0,0.0,...|
|      3|[0.0,0.0,0.0,0.0,...|
|      4|[0.0,0.0,0.0,0.0,...|
|      5|[0.0,0.0,0.0,0.0,...|
+-------+--------------------+

How can I take one user's taste vector (selected by its userId) and multiply it with every row of the movie content table in PySpark, so that I can get the most similar movies by movieId?
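For reference, the computation being asked about can be sketched as below. This is only a minimal sketch under stated assumptions, not a confirmed solution: it assumes `scaledFeatures` is a `pyspark.ml.linalg` vector column in both DataFrames (the usual output of a scaler), that both vectors have the same length, and that a plain dot product is an acceptable similarity score. The names `dot`, `top_movies_for_user`, `users_df`, `movies_df`, and the `score` column are hypothetical and do not appear in the question.

```python
def dot(u, v):
    """Dot product of two equal-length numeric sequences."""
    return float(sum(a * b for a, b in zip(u, v)))


def top_movies_for_user(users_df, movies_df, user_id, n=10):
    """Score every movie against one user's taste vector; return the n best.

    Assumption: both DataFrames hold a `scaledFeatures` column of
    pyspark.ml.linalg vectors (dense or sparse) of the same length.
    """
    # Imported here so that `dot` stays usable without a Spark installation.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    # Fetch the single user's vector to the driver (one small row).
    row = (users_df
           .filter(F.col("userId") == user_id)
           .select("scaledFeatures")
           .first())
    if row is None:
        raise ValueError(f"userId {user_id} not found")
    user_vec = row["scaledFeatures"].toArray().tolist()

    # UDF that scores one movie vector against the captured user vector.
    score = F.udf(lambda v: dot(user_vec, v.toArray().tolist()), DoubleType())

    # Highest score first, keeping only movieId and its score.
    return (movies_df
            .withColumn("score", score(F.col("scaledFeatures")))
            .select("movieId", "score")
            .orderBy(F.col("score").desc())
            .limit(n))
```

With the tables above, a call such as `top_movies_for_user(users_df, movies_df, 65, n=5).show()` would list five movieIds ranked by score. A raw dot product favours vectors with larger norms, so if that matters, normalising both columns first (which turns the score into cosine similarity) may be a better fit.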

Azamat
  • Your question is similar to this one: https://stackoverflow.com/questions/37766213/spark-matrix-multiplication-with-python (please remove your question if it answers yours). – Maghil vannan Aug 01 '20 at 07:44
  • It's quite similar, but I still can't figure out how to implement it. – Azamat Aug 01 '20 at 07:58

0 Answers