
This is my first time using mrjob, and I am running into the following problem when executing the relevant Python script with it:

No configs found; falling back on auto-configuration
Looking for hadoop binary in /home/work/alex/tools/hadoop-client-1.5.5/hadoop/bin...
Found hadoop binary: /home/work/alex/tools/hadoop-client-1.5.5/hadoop/bin/hadoop
Creating temp directory /tmp/simrank_mr.work.20161204.050846.350418
Using Hadoop version 2
STDERR: 16/12/04 13:08:48 INFO common.UpdateService: ZkstatusUpdater to hn01-lp-hdfs.dmop.ac.com:54310 started
STDERR: mkdir: cannot create directory -p: File exists
STDERR: java.io.IOException: cannot create directory -p: File exists
STDERR:         at org.apache.hadoop.fs.FsShell.mkdir(FsShell.java:1020)
STDERR:         at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1934)
STDERR:         at org.apache.hadoop.fs.FsShell.run(FsShell.java:2259)
STDERR:         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
STDERR:         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
STDERR:         at org.apache.hadoop.fs.FsShell.main(FsShell.java:2331)
Traceback (most recent call last):
  File "simrank_mr.py", line 121, in <module>
    MRSimRank.run()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/job.py", line 429, in run
    mr_job.execute()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/job.py", line 447, in execute
    super(MRJob, self).execute()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/launch.py", line 158, in execute
    self.run_job()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/launch.py", line 228, in run_job
    runner.run()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/runner.py", line 481, in run
    self._run()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/hadoop.py", line 335, in _run
    self._upload_local_files_to_hdfs()
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/hadoop.py", line 362, in _upload_local_files_to_hdfs
    self.fs.mkdir(self._upload_mgr.prefix)
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/fs/composite.py", line 76, in mkdir
    return self._do_action('mkdir', path)
  File "/home/work/.jumbo/lib/python2.7/site-packages/mrjob-0.5.6-py2.7.egg/mrjob/fs/composite.py", line 63, in _do_action
    raise first_exception
IOError: Could not mkdir hdfs:///user/work/alex/tmp/cluster/mrjob/tmp/tmp/simrank_mr.work.20161204.050846.350418/files/
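
For reference, the job is launched roughly like this (the input/output paths below are placeholders, not the real ones):

    # standard mrjob invocation on the Hadoop runner; HDFS paths are placeholders
    python simrank_mr.py -r hadoop hdfs:///path/to/input -o hdfs:///path/to/output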

Does anyone know how to solve this problem? Many thanks!

Joey
  • Hadoop does not like creating files when they already exist, and the error log shows where the error is; can you not just remove the file that is located there? P.S. You should really use a configuration file setup, since then you can always be sure of how your system behaves (see the sketch after these comments). – Chinny84 Dec 04 '16 at 06:01
  • Thank you for your reply! The strangest thing is that the HDFS path hdfs:///user/work/alex/tmp/cluster/mrjob/tmp/tmp/simrank_mr.work.20161204.050846.350418/files/ exists after calling self._do_action('mkdir', path), yet it still reports a mkdir failure. I am wondering whether this is a bug in mrjob? – Joey Dec 04 '16 at 06:12
  • For what it's worth, I experienced a similar issue. I fixed it with `hadoop fs -mkdir tmp` and then `hadoop fs -chmod 777 tmp` – openwonk Jan 21 '18 at 05:09
  • Restarting all the services via the Ambari login fixed the issue – Skanda Aug 20 '22 at 15:55
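
To make the first two suggestions above concrete, here is a rough sketch; it is only a guess at a workaround, and the config option names should be checked against your mrjob version's documentation:

    # Pre-create mrjob's HDFS temp prefix and open up its permissions
    # (paths are taken from the traceback above; adjust for your cluster)
    hadoop fs -mkdir /user/work/alex/tmp/cluster/mrjob/tmp
    hadoop fs -chmod -R 777 /user/work/alex/tmp/cluster/mrjob/tmp

    # ~/.mrjob.conf -- pin the Hadoop runner's behaviour explicitly (YAML).
    # hadoop_tmp_dir is the mrjob 0.5.x name for the HDFS scratch directory;
    # verify the exact option name for the version you have installed.
    runners:
      hadoop:
        hadoop_bin: /home/work/alex/tools/hadoop-client-1.5.5/hadoop/bin/hadoop
        hadoop_tmp_dir: hdfs:///user/work/alex/tmp/cluster/mrjob/tmp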

0 Answers