I'm using Hadoop streaming to run some Python code. I have noticed that if there is an error in my Python code (in mapper.py, for example), I won't be notified about the error. Instead, the mapper program will fail to run, and the job will be killed after a few seconds. Viewing the logs, the only error I see is that mapper.py failed to run or was not found, which is clearly not the case.
My question is: is there a specific log file I can check to see the actual errors raised by the mapper.py code (for example, one that would tell me whether an import failed)?
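One workaround I'm considering, though I'm not sure whether Hadoop streaming actually keeps a task's stderr anywhere useful, is wrapping the whole mapper, imports included, in a try/except that writes the traceback to stderr. A rough sketch is below; the NLTK word-count body is only illustrative, my real mapper follows the post linked at the end:

#!/usr/bin/env python
import sys
import traceback

try:
    # Everything, including the import I suspect is failing on the task
    # nodes, runs inside the try so any exception is written out instead
    # of the job just dying silently.
    from nltk.corpus import stopwords
    stop = set(stopwords.words('english'))

    for line in sys.stdin:
        for word in line.strip().split():
            if word.lower() not in stop:
                sys.stdout.write('%s\t%d\n' % (word, 1))
except Exception:
    traceback.print_exc(file=sys.stderr)
    sys.exit(1)

Even so, I'd still like to know where (or whether) that stderr output ends up in the Hadoop logs.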
Thank you!
edit: The command used:
bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
    -file /hadoop/mapper.py -mapper /hadoop/mapper.py \
    -file /hadoop/reducer.py -reducer /hadoop/reducer.py \
    -input /hadoop/input.txt -output /hadoop/output
and this is the post I'm referring to, where I'd like to be able to see the actual errors: Hadoop and NLTK: Fails with stopwords