I am new to mrjob and Hadoop. After building my Hadoop cluster, I tried to use mrjob to submit a job to Hadoop, but unfortunately it failed with the error "returned non-zero exit status 256". More details follow:
1. This is my example:
from mrjob.job import MRJob
import re

WORD_RE = re.compile(r"[\w']+")


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def combiner(self, word, counts):
        yield (word, sum(counts))

    def reducer(self, word, counts):
        yield (word, sum(counts))


if __name__ == '__main__':
    MRWordFreqCount.run()
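To rule out the job logic itself as the cause, the same map and reduce steps can be checked in plain Python, with no mrjob or Hadoop involved. This is only a minimal sketch: `run_local` is a hypothetical helper that imitates the shuffle by grouping mapper output by key, and the two sample lines are made up for illustration.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[\w']+")


def mapper(line):
    # Same logic as MRWordFreqCount.mapper: emit (word, 1) per word.
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def reducer(word, counts):
    # Same logic as MRWordFreqCount.reducer: sum the counts per word.
    yield word, sum(counts)


def run_local(lines):
    """Hypothetical helper: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    result = {}
    for word in sorted(grouped):
        for key, total in reducer(word, grouped[word]):
            result[key] = total
    return result


# Made-up sample input, not taken from pg20417.txt.
counts = run_local(["The quick brown fox", "the lazy dog's fox"])
print(counts)
```

If this behaves as expected, the word-count logic is probably fine and the failure is more likely in the environment on the cluster side. mrjob's own `-r inline` or `-r local` runners give a similar check using the real job class.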
2. I ran it with this command:
python test.py -r hadoop --python-bin=/root/.pyenv/versions/2.7.9/bin/python ./pg20417.txt
3. This is the output I got:
```
HADOOP: Job not successful!
HADOOP: Streaming Command Failed!
Job failed with return code 256: ['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']
Scanning logs for probable cause of failure
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    MRWordFreqCount.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 461, in run
    mr_job.execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 479, in execute
    super(MRJob, self).execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 151, in execute
    self.run_job()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 214, in run_job
    runner.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/runner.py", line 464, in run
    self._run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 237, in _run
    self._run_job_in_hadoop()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 372, in _run_job_in_hadoop
    raise CalledProcessError(returncode, step_args)
subprocess.CalledProcessError: Command '['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']' returned non-zero exit status 256
```
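One possible reading of the number 256: if mrjob is reporting the raw POSIX wait status of the `hadoop` process (as returned by `os.waitpid()`) rather than its exit code, then 256 would correspond to a plain exit code of 1. That is an assumption, not something the log confirms. The sketch below only shows the arithmetic; `decode_wait_status` is a hypothetical helper name.

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX wait status into (exit_code, signal_number).

    The high byte holds the exit code and the low 7 bits hold the
    terminating signal, which is the layout os.WEXITSTATUS() and
    os.WTERMSIG() decode on Unix systems.
    """
    exit_code = (status >> 8) & 0xFF
    signal_number = status & 0x7F
    return exit_code, signal_number


exit_code, signal_number = decode_wait_status(256)
print(exit_code, signal_number)
```

Under that assumption, the streaming command simply exited with status 1, so the number itself says little and the real cause would have to come from the Hadoop task logs.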
4. My environment is:
Hadoop 2.7.1
Python 2.7.9