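from mrjob.job import MRJob
from mrjob.protocol import JSONValueProtocol

class KittyJob(MRJob):
    # Write only the value of each output pair, encoded as JSON
    OUTPUT_PROTOCOL = JSONValueProtocol

    def mapper_cmd(self):
        # Use a shell command as the mapper instead of a Python method
        return "grep kitty"

    def reducer(self, key, values):
        # Emit one count per key: the number of values received
        yield None, sum(1 for _ in values)

if __name__ == '__main__':
    KittyJob().run()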
Source: https://mrjob.readthedocs.org/en/latest/guides/writing-mrjobs.html#protocols
How does this code count the number of lines containing "kitty"?
Also, where is OUTPUT_PROTOCOL defined?
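To make the expected result concrete, here is a minimal plain-Python sketch of what the job is meant to compute, with no mrjob or Hadoop involved. The function name `count_kitty_lines` and the sample data are hypothetical, and the sketch only mirrors the filter-then-count idea; it does not show how mrjob actually passes data between the `grep` mapper and the reducer.

```python
def count_kitty_lines(lines, needle="kitty"):
    """Count the lines that contain `needle` (hypothetical helper)."""
    # Filter step: keep only lines containing the substring. The match is
    # case-sensitive, as plain `grep kitty` would be.
    matching = (line for line in lines if needle in line)
    # Count step: the same idiom the reducer uses, one per surviving line.
    return sum(1 for _ in matching)


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = ["hello kitty", "a dog", "kitty litter", "Kitty (capitalized)"]
    print(count_kitty_lines(sample))
```

In this sketch the generator plays the role of the mapper (lines that do not match are dropped) and the `sum(1 for _ in ...)` expression plays the role of the reducer.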