Given a CSV file where each line contains a set of numbers, I want to write a MapReduce program that determines the maximum of all the numbers in the file. Let's say the CSV file contains the lines 3,4 and 5,6; the script should return 6.
from mrjob.job import MRJob


class MRWordCounter(MRJob):
    def mapper(self, key, line):
        for word in line.split():
            yield word, 1

    def reducer(self, word, occurrences):
        yield word, sum(occurrences)


if __name__ == '__main__':
    MRWordCounter.run()
Now, this script I found returns the number of occurrences of each token, but it does not work when a line contains multiple comma-separated values. How can I parse all the data in the CSV file and return the maximum?
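To illustrate what seems to go wrong with multiple values per line, here is a small sketch in plain Python (no mrjob needed). The variable names are only for illustration. Calling split() with no argument splits on whitespace, so a comma-separated line stays in one piece:

```python
line = "3,4"

# split() with no argument splits on whitespace only,
# so the whole line comes back as a single token.
whitespace_tokens = line.split()

# Passing "," as the separator yields one token per number.
comma_tokens = line.split(",")

print(whitespace_tokens)  # ['3,4']
print(comma_tokens)       # ['3', '4']
```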
UPDATE:
The input file I tried to parse as a test looks something like this:
1,1,1,1
2
3
4
5
6
After changing line.split() to line.split(","), it counted all the occurrences correctly.
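Building on the comma split, here is a minimal sketch of how the maximum might be computed with the same map/reduce shape. It is plain Python so it can be run without a cluster. The mapper and reducer names mirror the MRJob methods above, while run_locally is a hypothetical stand-in for the framework and is not part of mrjob. It assumes every non-empty field parses as a number; float is used so that decimals would also work.

```python
from collections import defaultdict


def mapper(_, line):
    """Emit the largest number on one CSV line under a single shared key."""
    # Skip empty fields so blank lines or trailing commas do not raise errors.
    values = [float(field) for field in line.split(",") if field.strip()]
    if values:
        yield "max", max(values)


def reducer(key, values):
    """Collapse all per-line maxima into one overall maximum."""
    yield key, max(values)


def run_locally(lines):
    """Hypothetical driver: map every line, group by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(None, line):
            grouped[key].append(value)

    results = {}
    for key, values in grouped.items():
        for out_key, out_value in reducer(key, values):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    # Same test input as in the update above.
    print(run_locally(["1,1,1,1", "2", "3", "4", "5", "6"]))
```

Because every mapper output shares the single key "max", all values end up in one reducer call. In an MRJob subclass, the same two bodies would presumably go into the mapper and reducer methods, with self as the first argument.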