I am following the tutorial in the mrjob documentation, and the word count job works on local files. But when I run it with the Hadoop runner:

     python mr.py -r hadoop 1.txt

it hangs.

When I interrupt it with Ctrl-C, the log is:

no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /var/folders/zv/1hqhxh0n6m374cwzysmdn6zc0000gn/T/mr.yd006t.20150508.194506.047719
writing wrapper script to /var/folders/zv/1hqhxh0n6m374cwzysmdn6zc0000gn/T/mr.yd006t.20150508.194506.047719/setup-wrapper.sh
Using Hadoop version 2.7.0
Copying local files into hdfs:///user/yd006t/tmp/mrjob/mr.yd006t.20150508.194506.047719/files/
^CTraceback (most recent call last):
  File "mr.py", line 16, in <module>
    MRWordFrequencyCount.run()
  File "/Library/Python/2.7/site-packages/mrjob/job.py", line 461, in run
    mr_job.execute()
  File "/Library/Python/2.7/site-packages/mrjob/job.py", line 479, in execute
    super(MRJob, self).execute()
  File "/Library/Python/2.7/site-packages/mrjob/launch.py", line 151, in execute
    self.run_job()
  File "/Library/Python/2.7/site-packages/mrjob/launch.py", line 214, in run_job
    runner.run()
  File "/Library/Python/2.7/site-packages/mrjob/runner.py", line 464, in run
    self._run()
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 237, in _run
    self._run_job_in_hadoop()
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 339, in _run_job_in_hadoop
    self._process_stderr_from_streaming(master)
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 388, in _process_stderr_from_streaming
    for line in treat_eio_as_eof(stderr):
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 381, in treat_eio_as_eof
    yield iter.next()  # okay for StopIteration to bubble up
KeyboardInterrupt
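
The traceback above shows the interrupt arriving while mrjob was reading stderr from the Hadoop streaming process, so one thing worth checking is whether the `hadoop` command line responds at all outside of mrjob. The following is a minimal sketch of such a check, not part of mrjob. The helper name `probe`, the choice of commands, and the 30-second limit are assumptions made for illustration, and it is written for Python 3 (the traceback shows Python 2.7, where `subprocess.run` is not available).

```python
import subprocess
import sys


def probe(cmd, timeout=30):
    """Run cmd and classify the result as 'ok', 'failed', 'timeout' or 'missing'.

    Hypothetical helper for illustration; not an mrjob or Hadoop API.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except FileNotFoundError:
        return "missing"  # executable is not on PATH
    except subprocess.TimeoutExpired:
        return "timeout"  # command did not finish within the limit
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Assumed checks: both are standard Hadoop CLI commands, but which
    # ones to run and how long to wait are illustrative choices.
    for cmd in (["hadoop", "version"], ["hadoop", "fs", "-ls", "/"]):
        print(" ".join(cmd), "->", probe(cmd))
```

If `hadoop fs -ls /` reports `timeout` here as well, the hang is probably in the Hadoop setup rather than in the job script, although this check alone cannot confirm the cause.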

This is the content of mr.py:

from mrjob.job import MRJob


class MRWordFrequencyCount(MRJob):

    def mapper(self, _, line):
        yield "chars", len(line)
        yield "words", len(line.split())
        yield "lines", 1

    def reducer(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    MRWordFrequencyCount.run()
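
For reference, the totals this job should produce can be reproduced without mrjob or Hadoop. The sketch below is a plain-Python stand-in for the mapper and reducer above, useful for confirming that the job logic itself is not the problem. `run_locally` is a hypothetical helper, and the sketch assumes each input line arrives without its trailing newline.

```python
from collections import defaultdict


def mapper(line):
    # Same three pairs the MRWordFrequencyCount mapper yields.
    yield "chars", len(line)
    yield "words", len(line.split())
    yield "lines", 1


def run_locally(lines):
    """Sum mapper output per key, as the reducer does (hypothetical helper)."""
    totals = defaultdict(int)
    for line in lines:
        for key, value in mapper(line):
            totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    # Two 11-character lines with 2 and 3 words.
    print(run_locally(["hello world", "foo bar baz"]))
    # prints {'chars': 22, 'words': 5, 'lines': 2}
```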
kevin ding