I'm writing a simple program that uses the mrjob library to map and reduce rows from a CSV file.
One of the columns in each row is yearID. This column is read in as a str by default, and I need to convert it to an int so that I can compare it. For some reason, the str-to-int conversion is not working and behaves strangely.
I get the following error when I run it:
ValueError: invalid literal for int() with base 10: 'yearID'
This error is raised by the line
    if int(stat.get("yearID")) > 1990:
in the following code:
    from mrjob.job import MRJob


    class MRPitching(MRJob):

        def mapper(self, _, line):
            row = line.split(",")
            playerID = row[0]
            whip = {
                "p_H": row[13],
                "p_BB": row[16],
                "p_IPOUTS": row[12],
                "yearID": row[1]
            }
            yield playerID, whip

        def reducer(self, playerID, pitchingStats):
            pHSum = 0
            pBBSum = 0
            pIPOUTSSum = 0
            for stat in pitchingStats:
                if int(stat.get("yearID")) > 1990:
                    yield playerID, stat


    if __name__ == "__main__":
        MRPitching.run()
For some reason the int() function is receiving the string yearID as its argument, when it should instead receive the value of stat.get("yearID"). When I print stat.get("yearID"), I see the expected value, so I don't understand why int() is getting yearID.
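One possible explanation, which I have not confirmed against the actual file: if the CSV starts with a header row, that row is passed to the mapper like any other line, so row[1] is the literal column name yearID and int() fails on it. Printing would still show valid years for every other row, which makes the single header value easy to miss. Below is a minimal sketch of that idea in plain Python, without mrjob. The helper parse_row and the sample lines are made up for illustration, and the column layout is shortened to three fields.

```python
def parse_row(line):
    """Split one CSV line; return (playerID, stats) or None for a header row."""
    row = line.split(",")
    year = row[1]
    # Assumption: a header row carries the column name (non-numeric text)
    # in the year field, so anything that is not all digits is skipped.
    if not year.isdigit():
        return None
    return row[0], {"yearID": int(year)}


# Hypothetical sample input: a header line followed by two data lines.
lines = [
    "playerID,yearID,stint",
    "player01,1991,1",
    "player02,1985,1",
]

parsed = [parse_row(line) for line in lines]
# Keep only real data rows from after 1990.
recent = [p for p in parsed if p is not None and p[1]["yearID"] > 1990]
print(recent)  # [('player01', {'yearID': 1991})]
```

If this is the cause, the same check could go at the top of the mapper, returning early for the header line before anything is yielded.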