I am trying to use the mrjob package in Python. I want to send a file (u.item) along with my code to all the nodes, so I override configure_options and call add_file_option to declare that a file will be passed on the command line. When I run this command: !python MostPopularMovieNicer.py --items=u.item u.data everything stops and Python freezes without showing any output.
I have already run the trace command and got this error: TypeError: a bytes-like object is required, not 'str'
    from mrjob.job import MRJob
    from mrjob.step import MRStep

    class MostPopularMovieNicer(MRJob):

        def configure_options(self):
            super(MostPopularMovieNicer, self).configure_options()
            self.add_file_option('--items', help='Path to u.item')

        def steps(self):
            return [
                MRStep(mapper=self.mapper_get_ratings,
                       reducer_init=self.reducer_init,
                       reducer=self.reducer_count_ratings),
                MRStep(reducer=self.reducer_find_max)
            ]

        def mapper_get_ratings(self, _, line):
            (userID, movieID, rating, timestamp) = line.split('\t')
            yield movieID, 1

        def reducer_init(self):
            self.movieNames = {}
            with open("u.ITEM", encoding='ascii', errors='ignore') as f:
                for line in f:
                    fields = line.split('|')
                    self.movieNames[fields[0]] = fields[1]

        def reducer_count_ratings(self, key, values):
            yield None, (sum(values), self.movieNames[key])

        def reducer_find_max(self, key, values):
            yield max(values)

    if __name__ == '__main__':
        MostPopularMovieNicer.run()
No error is shown and Python freezes. I have to quit the application before I can run any other code.
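
For reference, the TypeError quoted above is the message Python 3 raises when a bytes object is split with a str separator. The following is only a minimal sketch of that behaviour, independent of mrjob; the helper names split_fields and reproduce_error are hypothetical, and this may not be the exact spot where the job fails:

```python
def reproduce_error():
    """Trigger the same TypeError by splitting bytes with a str separator."""
    try:
        b"1|Toy Story (1995)".split("|")  # bytes.split expects a bytes separator
    except TypeError as exc:
        return str(exc)
    return None


def split_fields(line, sep="|"):
    """Split a line on sep, decoding bytes to str first (hypothetical helper)."""
    if isinstance(line, bytes):
        # Assumption: the data is ASCII, as in the open() call in the question.
        line = line.decode("ascii", errors="ignore")
    return line.split(sep)


if __name__ == "__main__":
    print(reproduce_error())
    print(split_fields(b"1|Toy Story (1995)"))
```

If the lines reaching line.split('|') or line.split('\t') are bytes rather than str, decoding them first (or splitting on a bytes separator such as b'|') avoids this particular error.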