How to do matrix multiplication in pyspark?

Asked Aug 01 '20 at 07:21

Active Aug 01 '20 at 07:21

Viewed 132 times

I have users' "taste" vectors:

+------+--------------------+
|userId|      scaledFeatures|
+------+--------------------+
|    18|[0.0,0.0,0.0,0.0,...|
|    65|[0.0,0.0023910733...|
|    96|[0.0,0.0,0.005268...|
|   121|[0.0,0.0021253985...|
|   129|[0.0,0.0029224229...|
+------+--------------------+

And the movies' content vectors:

+-------+--------------------+
|movieId|      scaledFeatures|
+-------+--------------------+
|      1|[1.0,0.0,0.0,0.0,...|
|      2|[1.0,0.0,0.0,0.0,...|
|      3|[0.0,0.0,0.0,0.0,...|
|      4|[0.0,0.0,0.0,0.0,...|
|      5|[0.0,0.0,0.0,0.0,...|
+-------+--------------------+

How can I take one user's taste vector, selected by its userId, and multiply it with the movies' content table in PySpark, so that I can get the most similar movies by movieId?

Azamat

Your question is similar to this one: https://stackoverflow.com/questions/37766213/spark-matrix-multiplication-with-python . Please remove your question if that answers it. – Maghil vannan Aug 01 '20 at 07:44
It's quite similar, but I still can't figure out how to implement it. – Azamat Aug 01 '20 at 07:58
