I am running one of the examples (pi) that ships with Hadoop. The job never finishes: the client prints the output below and then hangs, possibly because of a connection problem with HDFS.

yarn jar hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 100

16/07/27 06:32:38 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/07/27 06:32:38 INFO input.FileInputFormat: Total input paths to process : 10
16/07/27 06:32:38 INFO mapreduce.JobSubmitter: number of splits:10
16/07/27 06:32:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1469626018898_0001
16/07/27 06:32:39 INFO impl.YarnClientImpl: Submitted application application_1469626018898_0001
16/07/27 06:32:39 INFO mapreduce.Job: The url to track the job: http://IP_ADDRESS/proxy/application_14696260188001/
16/07/27 06:32:39 INFO mapreduce.Job: Running job: job_1469626018898_0001

I ran `telnet IP_ADDRESS 9000` and the connection was successful.

I have already set up hdfs-site.xml with the following (so that the NameNode listens on both the private and public addresses):

<property>
   <name>dfs.namenode.rpc-bind-host</name>
   <value>0.0.0.0</value>
</property>

And core-site.xml is set up with:

<property>
   <name>fs.defaultFS</name>
   <value>hdfs://IP_ADDRESS:9000</value>
</property>

Any ideas why the YARN job looks like it's not reaching the HDFS service and therefore never completes?

nikk
  • I don't see any error in your log. What do you mean by "does not respond"? Does the output just stop there until you kill it? Can you check whether the following folder was created in your HDFS: `/user//QuasiMonteCarlo_/in` – Nicomak Jul 27 '16 at 07:00
  • The output you see is from the yarn client itself. For the actual Hadoop DFS log, I opened the log for either the datanode or the namenode (I don't remember exactly which one), and I think it contained something like `connection refused.. retying after...`. The connection was to port `9000`. Anyway, `hadoop dfs -ls /user//QuasiMonteCarlo_/in` shows `part0`, `part1` ... `part9`, each with 118 bytes of data. – nikk Jul 27 '16 at 07:27
  • It would be good to know when the job is done, with the yarn client saying so, instead of the user having to kill it because the output never updates and stays at `INFO mapreduce.Job: Running job: job_1469626018898_0001`. – nikk Jul 27 '16 at 07:36
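
To tell a job that is still waiting to be scheduled from one that is truly stuck, the application state can be queried from a second terminal. The following is a minimal sketch, assuming the `yarn` CLI from the same Hadoop 2.6.0 install is on the PATH; the application ID is taken from the client output in the question, and the variable names `log_line` and `app_id` are illustrative only.

```shell
# Sample line copied from the yarn client output in the question.
log_line='16/07/27 06:32:39 INFO impl.YarnClientImpl: Submitted application application_1469626018898_0001'

# Extract the application ID (application_<cluster timestamp>_<sequence>).
app_id=$(printf '%s\n' "$log_line" | grep -o 'application_[0-9]*_[0-9]*')
echo "$app_id"

# Query the cluster only if the yarn CLI is available (assumption: same install as above).
if command -v yarn >/dev/null 2>&1; then
    # Reports the application state, e.g. ACCEPTED, RUNNING, FINISHED, FAILED or KILLED.
    yarn application -status "$app_id"
    # Lists the NodeManagers currently registered with the ResourceManager.
    yarn node -list
fi
```

If the application stays in the ACCEPTED state, or the node list is empty, that would point towards the job never being scheduled rather than a failure to reach HDFS. This is something to verify on the cluster, not a diagnosis.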

0 Answers