I am trying to run an MRJob job on a Hadoop cluster from the Windows command line. It works when I run it without the `-r hadoop` option:

Python C:\Users\salha\Documents\Thesis\Implementation\Jacobi_2classes.py C:\Users\salha\Documents\Thesis\Implementation\x.txt C:\Users\salha\Documents\Thesis\Implementation\b.txt C:\Users\salha\Documents\Thesis\Implementation\matrix.txt

It fails when I run it on Hadoop. Here is the command that I wrote:

Python C:\Users\salha\Documents\Thesis\Implementation\Jacobi_2classes.py -r hadoop --hadoop-streaming-jar "C:\hadoop-2.9.1\share\hadoop\tools\lib\hadoop-streaming-2.9.1.jar" C:\Users\salha\Documents\Thesis\Implementation\x.txt C:\Users\salha\Documents\Thesis\Implementation\b.txt C:\Users\salha\Documents\Thesis\Implementation\matrix.txt

Here is the output I got:

C:\Users\salha\Anaconda3\lib\site-packages\numpy\__init__.py:140: UserWarning: mkl-service package failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see http://github.com/IntelPython/mkl-service
  from . import _distributor_init
No configs found; falling back on auto-configuration
No configs specified for hadoop runner
Looking for hadoop binary in C:\hadoop-2.9.1\bin\bin...
Looking for hadoop binary in $PATH...
Found hadoop binary: C:\hadoop-2.9.1\bin\hadoop.CMD
Using Hadoop version 2.9.1
Creating temp directory C:\Users\salha\AppData\Local\Temp\Jacobi_2classes.salha.20200303.052236.139525
uploading working dir files to hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd...
Copying other local files to hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/
Running step 1 of 2...
  WARNING: An illegal reflective access operation has occurred
  WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/C:/hadoop-2.9.1/share/hadoop/common/lib/hadoop-auth-2.9.1.jar) to method sun.security.krb5.Config.getInstance()
  WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
  WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
  WARNING: All illegal access operations will be denied in a future release
  Found 2 unexpected arguments on the command line [hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/mrjob.zip#mrjob.zip, hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/setup-wrapper.sh#setup-wrapper.sh]
  Try -help for more information
  Streaming Command Failed!
Attempting to fetch counters from logs...
Can't fetch history log; missing job ID
No counters found
Scanning logs for probable cause of failure...
Can't fetch history log; missing job ID
Can't fetch task logs; missing application ID
Step 1 of 2 failed: Command '['C:\\hadoop-2.9.1\\bin\\hadoop.CMD', 'jar', 'C:\\hadoop-2.9.1\\share\\hadoop\\tools\\lib\\hadoop-streaming-2.9.1.jar', '-files', 'hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/Jacobi_2classes.py#Jacobi_2classes.py,hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/mrjob.zip#mrjob.zip,hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/x.txt', '-input', 'hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/b.txt', '-input', 'hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/matrix.txt', '-output', 'hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/step-output/0000', '-mapper', '/bin/sh -ex setup-wrapper.sh python3 Jacobi_2classes.py --step-num=0 --mapper', '-reducer', '/bin/sh -ex setup-wrapper.sh python3 Jacobi_2classes.py --step-num=0 --reducer']' returned non-zero exit status 1.
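
One observation from the log, offered as a hypothesis that nobody in this thread has confirmed: the two "unexpected arguments" are exactly the second and third entries of the comma-separated `-files` value in the failing command. cmd.exe treats commas as parameter delimiters for batch scripts, and the binary found here is `hadoop.CMD`, so the list may be getting split before it reaches Hadoop streaming. The sketch below only models that splitting in Python. `split_like_batch_parameters` is a hypothetical helper written for this illustration, not part of mrjob or Hadoop.

```python
import re

# Prefix copied from the log above; the three entries are the ones mrjob
# passed after -files in the failing command.
WD = "hdfs:///user/salha/tmp/mrjob/Jacobi_2classes.salha.20200303.052236.139525/files/wd/"
files_value = ",".join(
    WD + name + "#" + name
    for name in ("Jacobi_2classes.py", "mrjob.zip", "setup-wrapper.sh")
)


def split_like_batch_parameters(argument):
    """Split an unquoted argument on the characters cmd.exe treats as
    batch-parameter delimiters (space, tab, comma, semicolon, equals).

    Simplified model for illustration only, not a cmd.exe emulator.
    """
    return [part for part in re.split(r"[ \t,;=]+", argument) if part]


parts = split_like_batch_parameters(files_value)
kept_by_files_option = parts[0]  # the only entry -files would still receive
unexpected = parts[1:]           # entries left over as stray arguments

# Prints the mrjob.zip and setup-wrapper.sh entries, the same two items
# that Hadoop reported as unexpected arguments.
for item in unexpected:
    print(item)
```

If this is what is happening, note also that the mapper and reducer commands in the failing step start with `/bin/sh -ex setup-wrapper.sh`, which assumes a Unix shell is available where the tasks run.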
  • Why not use pyspark? – OneCricketeer Mar 03 '20 at 13:43
  • Thank you for asking. I am doing two experiments: the first solves the problem with MapReduce, the second with Spark, and then I will compare them. Right now I am on the first experiment, solving the problem with MapReduce, and I have to call the jobs multiple times (e.g. 100 times) – Salha Alfarsi Mar 04 '20 at 06:00
  • Alright, well, I think your installation is misconfigured because `Can't fetch history log; missing job ID` seems like an issue. I suggest using Ambari to reinstall your Hadoop environment – OneCricketeer Mar 04 '20 at 13:35
  • Thank you for your reply. Does MRJob run on Ambari? I wrote the code in Python using MRJob to run a multi-step job – Salha Alfarsi Mar 04 '20 at 16:05
  • MRJob is just a Python library that runs anywhere against YARN. Ambari is a management/installation UI for Hadoop/YARN/Hive/Spark etc – OneCricketeer Mar 04 '20 at 20:27
  • This is the first time I have heard of Ambari, so I did a quick search on it. It seems it has to be installed on Ubuntu, not Windows, and I am using Windows 10. Thank you – Salha Alfarsi Mar 05 '20 at 08:44
  • It's a Python server process. Last I checked, it can be built and run on Windows just fine – OneCricketeer Mar 05 '20 at 14:46
  • Thank you. Could you please post a link on how to download and run Ambari on Windows 10? This is what I found on https://ambari.apache.org/: "Note: Ambari currently supports the 64-bit version of the following Operating Systems: RHEL (Redhat Enterprise Linux) 7.6, 7.5, 7.4, 7.3, 7.2 CentOS 7.6, 7.5, 7.4, 7.3, 7.2 OEL (Oracle Enterprise Linux) 7.6, 7.5, 7.4, 7.3, 7.2 Amazon Linux 2 SLES (SuSE Linux Enterprise Server) 12 SP4, 12 SP3, 12 SP2 Ubuntu 14, 16 and 18 Debian 9" – Salha Alfarsi Mar 10 '20 at 08:14
  • I don't have a link. I have always just used a VM or Docker – OneCricketeer Mar 10 '20 at 12:58
  • @SalhaAlfarsi Did you ever find a solution to this issue? I have the same problem on my Windows machine. – Cassova Mar 18 '21 at 22:02
