I am trying to run a Python script with MRJob on a cluster where I don't have admin permissions, and I get the error pasted below. I think the job is trying to write its intermediate files to the default /tmp/... directory; since that is a protected directory I cannot write to, the job fails and exits. How can I change this tmp output directory to a location where I do have write permission, for example:
/home/myusername/some_path_in_my_local_filesystem_on_the_cluster
In other words, which additional parameters do I have to pass to move the intermediate output from /tmp/... to a writable location?
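The assumption that the local tmp directory is not writable can be tested directly. Below is a minimal sketch, assuming a POSIX-like filesystem; the helper name is_writable_dir is hypothetical and not part of mrjob, and the directories checked are only placeholders.

```python
import os
import tempfile


def is_writable_dir(path):
    """Return True if `path` is an existing directory we can create files in."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating a real file is more reliable than os.access() on
        # network filesystems; the file is deleted when it is closed.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder candidates: the system temp dir and the home directory.
    candidates = [tempfile.gettempdir(), os.path.expanduser("~")]
    for candidate in candidates:
        print("%s writable: %s" % (candidate, is_writable_dir(candidate)))
```

If the local directory turns out to be writable, the Permission denied message may refer to the HDFS side instead: the log reports the denied inode as "/" owned by hdfs:supergroup, and it comes from a Hadoop mkdir call rather than from a local write.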
I invoke my script as:
python myscript.py input.txt -r hadoop > output.txt
The error:
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /tmp/13435.1.all.q/mr_word_freq_count.myusername.20131215.004905.274232
writing wrapper script to /tmp/13435.1.all.q/mr_word_freq_count.myusername.20131215.004905.274232/setup-wrapper.sh
STDERR: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=myusername, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
Traceback (most recent call last):
  File "/home/myusername/privatemodules/python/examples/mr_word_freq_count.py", line 37, in <module>
    MRWordFreqCount.run()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/job.py", line 500, in run
    mr_job.execute()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/job.py", line 518, in execute
    super(MRJob, self).execute()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/launch.py", line 146, in execute
    self.run_job()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/launch.py", line 207, in run_job
    runner.run()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/hadoop.py", line 236, in _run
    self._upload_local_files_to_hdfs()
  File "/home/myusername/.local/lib/python2.7/site-packages/mrjob/hadoop.py", line 263, in _upload_local_files_to_hdfs
    self._mkdir_on_hdfs(self._upload_mgr.prefix)