I have an mrjob job with a mapper and a reducer, shown below.
from mrjob.job import MRJob
from mrjob.step import MRStep


class SortNumMoviesDesc(MRJob):
    def steps(self):
        return [MRStep(mapper=self.mapper_retrieve_counts,
                       reducer=self.reducer_sort_counts)]

    def mapper_retrieve_counts(self, _, line):
        movie_year_comp, counts = line.split("\t")
        yield movie_year_comp, counts

    def reducer_sort_counts(self, key, values):
        yield key, values


if __name__ == '__main__':
    SortNumMoviesDesc.run()
I run this file as python sort_counts.py task1_output > task2_output.py and get an error: object of type function is not json serializable.
Task1_output.txt looks like this (first few lines):
"1916 Triangle Film Corporation" 1
"1916 Wark Producing Corp." 1
"1925 Metro-Goldwyn-Mayer (MGM)" 5
"1927 Paramount Pictures" 1
"1927 Universum Film (UFA)" 17
"1929 Metro-Goldwyn-Mayer (MGM)" 1
"1929 Nero Films" 12
The goal is to sort the counts in descending order, which needs sorting logic in the reducer, but I cannot figure out why I am getting this error or how to fix it.
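
To make the goal concrete, here is a minimal sketch in plain Python (no mrjob) of the ordering I am after. It assumes each input line is a JSON-quoted key, a tab, and an integer count, as in the sample above. The function name sort_counts_desc and the sample list are only for illustration and are not part of my job.

```python
import json


def sort_counts_desc(lines):
    """Parse 'key<TAB>count' lines into (key, count) pairs, largest count first."""
    pairs = []
    for line in lines:
        # Assumption: fields are tab-separated and the key is a JSON string.
        key, count = line.rstrip("\n").split("\t")
        pairs.append((json.loads(key), int(count)))
    # sorted() is stable, so equal counts keep their input order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample built from a few of the lines shown above.
sample = [
    '"1925 Metro-Goldwyn-Mayer (MGM)"\t5',
    '"1927 Universum Film (UFA)"\t17',
    '"1929 Nero Films"\t12',
    '"1916 Triangle Film Corporation"\t1',
]

for key, count in sort_counts_desc(sample):
    print(count, key)
```

What I want from the mrjob job is the same ordering, produced by the reducer step.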