I have an mrjob job with a mapper and a reducer, shown below.

from mrjob.job import MRJob
from mrjob.step import MRStep


class SortNumMoviesDesc(MRJob):
    def steps(self):
        return [MRStep(mapper=self.mapper_retrieve_counts,
                       reducer=self.reducer_sort_counts)]

    def mapper_retrieve_counts(self, _, line):
        movie_year_comp, counts = line.split("\t")
        yield movie_year_comp, counts

    def reducer_sort_counts(self, key, values):
        yield key, values


if __name__ == '__main__':
    SortNumMoviesDesc.run()

I run the file as python sort_counts.py task1_output > task2_output.py and get this error: object of type function is not JSON serializable.

Task1_output.txt looks like this (first few lines):

"1916 Triangle Film Corporation"    1
"1916 Wark Producing Corp." 1
"1925 Metro-Goldwyn-Mayer (MGM)"    5
"1927 Paramount Pictures"   1
"1927 Universum Film (UFA)" 17
"1929 Metro-Goldwyn-Mayer (MGM)"    1
"1929 Nero Films"   12

The goal is to sort the counts in descending order, which needs sorting logic in the reducer, but I can't figure out why I am getting this error or how to fix it.
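
A possible direction, offered as a sketch rather than a confirmed diagnosis: the reducer yields values directly, and in mrjob that argument is a lazy iterator, which the json module cannot serialize. The snippet below reproduces that kind of TypeError with a plain generator, then shows one way the sorting could work. It is written as standalone generator functions so it runs without mrjob; in the job, the same bodies would go into mapper_retrieve_counts and reducer_sort_counts (with self as the first argument). Emitting every row under a single None key, and stripping the quotes around the key with json.loads, are assumptions about the input and not something confirmed above.

```python
import json

# 1) Reproduce the kind of error: a lazy iterator cannot be JSON-encoded.
lazy_values = (v for v in ["1", "5"])
try:
    json.dumps(lazy_values)
    error = None
except TypeError as exc:
    error = str(exc)  # the message names the offending type

# 2) Sketch of the sorting logic as plain generator functions.
def mapper_retrieve_counts(line):
    # Each input line is assumed to be: "<quoted key><TAB><count>"
    movie_year_comp, count = line.rstrip("\n").split("\t")
    # Emit every row under one key (None) so a single reducer call sees them all.
    yield None, (int(count), json.loads(movie_year_comp))

def reducer_sort_counts(_, values):
    # sorted() consumes the iterator; highest count first, ties by name.
    for count, movie_year_comp in sorted(values, key=lambda cv: (-cv[0], cv[1])):
        yield movie_year_comp, count

# Small local run on lines shaped like the sample input (tab-separated).
sample = [
    '"1916 Triangle Film Corporation"\t1',
    '"1925 Metro-Goldwyn-Mayer (MGM)"\t5',
    '"1927 Universum Film (UFA)"\t17',
]
pairs = [kv for line in sample for kv in mapper_retrieve_counts(line)]
result = list(reducer_sort_counts(None, (v for _, v in pairs)))
for name, count in result:
    print(name, count)
```

Converting the count to int matters here, because sorting the raw strings would place "17" before "5". Sending all rows to one key means a single reducer handles the whole dataset, which is reasonable for a small summary file like this one but would not scale to very large inputs.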
