I have a JavaRDD that contains userId, movieId, and rating values, like this:
Rating [userId=1, movieId=2858, rating=4.0], Rating [userId=3, movieId=2858, rating=5.0], Rating [userId=12, movieId=2658, rating=5.0]
I want to find the top 5 movies by number of views. I tried Googling but could not work out how to group by movieId and count the users in a JavaRDD. I want to count how many users watched each movie and store the result in a Map as Map(movieId, num_of_user). I am new to Apache Spark.
Desired Output:
2858 - 2
2658 - 1
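To make the desired aggregation concrete, here is a minimal plain-Java sketch (no Spark involved) that produces the output above from the sample data. The `Rating` class, `TopMovies`, `countViewers`, and `topN` are hypothetical names made up for illustration, and the sketch assumes each user rates a movie at most once, so counting ratings equals counting users. On a JavaRDD, the analogous step would presumably be something like `mapToPair` followed by `reduceByKey`, or `countByValue` on the movie ids, but that part is an untested assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TopMovies {

    // Hypothetical stand-in for the Rating class stored in the RDD.
    static final class Rating {
        final int userId;
        final int movieId;
        final double rating;

        Rating(int userId, int movieId, double rating) {
            this.userId = userId;
            this.movieId = movieId;
            this.rating = rating;
        }
    }

    // Count the ratings per movieId (assumes one rating per user per movie).
    static Map<Integer, Long> countViewers(List<Rating> ratings) {
        return ratings.stream()
                .collect(Collectors.groupingBy(r -> r.movieId, Collectors.counting()));
    }

    // Keep the n movies with the highest counts, largest count first.
    static Map<Integer, Long> topN(Map<Integer, Long> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<Integer, Long>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (a, b) -> a,
                        LinkedHashMap::new)); // LinkedHashMap preserves the sorted order
    }

    public static void main(String[] args) {
        List<Rating> ratings = Arrays.asList(
                new Rating(1, 2858, 4.0),
                new Rating(3, 2858, 5.0),
                new Rating(12, 2658, 5.0));

        // Prints one "movieId - count" line per movie, most-viewed first.
        topN(countViewers(ratings), 5)
                .forEach((movieId, viewers) -> System.out.println(movieId + " - " + viewers));
    }
}
```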
I would appreciate any similar example, link, or tutorial that performs this kind of operation on a JavaRDD.
Update: I found a similar Scala-based question. Could somebody have a look and convert the Scala code to Java?
Thanks in advance.