I have a DataFrame with each user's "taste" vector:
+------+--------------------+
|userId|      scaledFeatures|
+------+--------------------+
|    18|[0.0,0.0,0.0,0.0,...|
|    65|[0.0,0.0023910733...|
|    96|[0.0,0.0,0.005268...|
|   121|[0.0,0.0021253985...|
|   129|[0.0,0.0029224229...|
+------+--------------------+
And a DataFrame with each movie's content vector:
+-------+--------------------+
|movieId|      scaledFeatures|
+-------+--------------------+
|      1|[1.0,0.0,0.0,0.0,...|
|      2|[1.0,0.0,0.0,0.0,...|
|      3|[0.0,0.0,0.0,0.0,...|
|      4|[0.0,0.0,0.0,0.0,...|
|      5|[0.0,0.0,0.0,0.0,...|
+-------+--------------------+
In PySpark, how can I look up a user's taste vector by its userId and multiply it with every vector in the movie content table, so that I can get the most similar movies by movieId?
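
To make the goal concrete, here is a minimal plain-Python sketch (not PySpark) of the computation I am after. It assumes that "multiply" means a dot product between the user's vector and each movie's vector, and that "most similar" means the highest resulting score. The dictionaries, the short three-element vectors, and the helper names `dot` and `top_movies` are made up for illustration; they are not my real data or an existing API.

```python
# Toy stand-ins for the two DataFrames. The values are invented for
# illustration; the real scaledFeatures vectors are much longer.
user_vectors = {
    18: [0.0, 0.5, 0.5],
    65: [1.0, 0.0, 0.0],
}
movie_vectors = {
    1: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 1.0],
    3: [0.0, 0.0, 1.0],
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_movies(user_id, n=2):
    """Score every movie against one user's vector; return the n best movieIds."""
    taste = user_vectors[user_id]
    scores = {movie_id: dot(taste, vec) for movie_id, vec in movie_vectors.items()}
    # Highest score first; movies with equal scores keep their original order.
    return sorted(scores, key=scores.get, reverse=True)[:n]


print(top_movies(18))
```

What I am looking for is the equivalent of `top_movies` that works on the two Spark DataFrames above, ideally without collecting the whole movie table to the driver.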