I am trying to run a MapReduce job from a Jupyter Notebook on the dataset in the u.data file, but I keep getting this error:
"TypeError: 'str' object doesn't support item deletion".
How can I make the code run successfully?
The u.data file contains rows like the following:
196 242 3 881250949
186 302 3 891717742
22 377 1 878887116
244 51 2 880606923
166 346 1 886397596
298 474 4 884182806
115 265 2 881171488
253 465 5 891628467
305 451 3 886324817
6 86 3 883603013
And here is the code:
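from mrjob.job import MRJob


class MRRatingCounter(MRJob):
    def mapper(self, key, line):
        # Each input line holds userID, movieID, rating, timestamp, split on tabs
        (userID, movieID, rating, timestamp) = line.split("\t")
        yield rating, 1

    def reducer(self, rating, occurrences):
        # Add up the 1s emitted for each rating value
        yield rating, sum(occurrences)


if __name__ == "__main__":
    MRRatingCounter.run()

# Attempt to run the job on the data file
filepath = "u.data"
MRRatingCounter(filepath)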
This code runs successfully if I save it as a .py file and run it from the command line: !python ratingCounter.py u.data
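For reference, here is a minimal plain-Python sketch (no mrjob involved) of two things: the per-rating counts the job should produce for the ten sample rows above, and how the same TypeError message appears when a bare string is used where a mutable list is expected. The helper names count_ratings and delete_first are hypothetical, and the rows are assumed to be tab-separated, as the mapper's split("\t") suggests. The mrjob documentation describes the constructor's args parameter as a list of command-line arguments, so passing the string "u.data" instead of a list such as ["u.data"] is one plausible trigger. Whether mrjob fails on exactly this path depends on its version, so treat that as an assumption.

```python
from collections import Counter

# The ten sample rows from u.data, assumed tab-separated:
# userID, movieID, rating, timestamp
SAMPLE = """196\t242\t3\t881250949
186\t302\t3\t891717742
22\t377\t1\t878887116
244\t51\t2\t880606923
166\t346\t1\t886397596
298\t474\t4\t884182806
115\t265\t2\t881171488
253\t465\t5\t891628467
305\t451\t3\t886324817
6\t86\t3\t883603013
"""


def count_ratings(text):
    """Plain-Python stand-in for the mapper/reducer pair: count rows per rating."""
    counts = Counter()
    for line in text.splitlines():
        user_id, movie_id, rating, timestamp = line.split("\t")
        counts[rating] += 1
    return dict(counts)


def delete_first(args):
    """Delete the first item in place (hypothetical stand-in for argument handling)."""
    del args[0]
    return args


rating_counts = count_ratings(SAMPLE)
print(rating_counts)

error_message = None
try:
    delete_first("u.data")  # a bare string is immutable, so item deletion fails
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

print(delete_first(["ratingCounter.py", "u.data"]))  # a list supports item deletion
```

If the string-versus-list guess is right, the notebook call would need a list of arguments, not a bare path. That alone may not be enough to run a job class defined inside a notebook cell.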