
I am new to mrjob and Hadoop. After building my Hadoop cluster, I tried to use mrjob to submit a job to Hadoop, but unfortunately it failed with the error "returned non-zero exit status 256". Details follow:

1. This is my example:

```python
from mrjob.job import MRJob
import re

WORD_RE = re.compile(r"[\w']+")


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def combiner(self, word, counts):
        yield (word, sum(counts))

    def reducer(self, word, counts):
        yield (word, sum(counts))


if __name__ == '__main__':
    MRWordFreqCount.run()
```
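For reference, the same job can be run locally with mrjob's built-in inline runner (no Hadoop involved) to check that the mapper/combiner/reducer themselves work. This is only a minimal sketch: it assumes test.py and pg20417.txt sit in the current directory, and the output-reading calls (`stream_output` / `parse_output_line`) may differ slightly between mrjob versions:

```python
# Minimal local sanity check: the inline runner executes everything in a
# single Python process, so Hadoop is not involved at all.
# Assumes test.py (the script above) and pg20417.txt are in the current directory.
from test import MRWordFreqCount

job = MRWordFreqCount(args=['-r', 'inline', 'pg20417.txt'])
with job.make_runner() as runner:
    runner.run()
    for line in runner.stream_output():          # raw "key\tvalue" output lines
        key, value = job.parse_output_line(line)
        print('%s\t%d' % (key, value))
```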

2. I submit it with this command:

```
python test.py -r hadoop --python-bin=/root/.pyenv/versions/2.7.9/bin/python ./pg20417.txt
```

3. This is the result I got:

```
HADOOP: Job not successful!
HADOOP: Streaming Command Failed!
Job failed with return code 256: ['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']
Scanning logs for probable cause of failure
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    MRWordFreqCount.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 461, in run
    mr_job.execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 479, in execute
    super(MRJob, self).execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 151, in execute
    self.run_job()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 214, in run_job
    runner.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/runner.py", line 464, in run
    self._run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 237, in _run
    self._run_job_in_hadoop()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 372, in _run_job_in_hadoop
    raise CalledProcessError(returncode, step_args)
subprocess.CalledProcessError: Command '['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']' returned non-zero exit status 256
```
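A side note on the number itself: 256 looks to me like a raw POSIX wait status rather than a decoded exit code (256 == 1 << 8), which would mean the underlying `hadoop jar ...` command simply exited with status 1. That is only my reading of it, not something the mrjob output states. A quick check of that interpretation:

```python
import os

# If 256 is a raw wait status, the high byte carries the real exit code.
status = 256
print(os.WIFEXITED(status))    # True: the child process exited normally
print(os.WEXITSTATUS(status))  # 1:   i.e. `hadoop jar ...` returned exit code 1
```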

4. And my environment is:

Hadoop 2.7.1
Python 2.7.9