i've searched for two days for a solution. but nothing worked.
First, i'm new to the whole hadoop/yarn/hdfs topic and want to configure a small cluster.
the message above doesn't show up everytime i run an example from the mapreduce-examples.jar sometimes teragen works, sometimes not. in some cases the whole job failed, in others the job finishes successfully. sometimes the job failes, without printing the message above.
14/06/08 15:42:46 INFO ipc.Client: Retrying connect to server: FQDN-HOSTNAME/XXX.XX.XX.XXX:53022. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
this message is print 30 times. also the port (in code example: 53022) changes with every time a job is started. if job finished succesfuly, this is print
14/06/08 15:34:20 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
14/06/08 15:34:20 INFO mapreduce.Job: Job job_1402234146062_0002 running in uber mode : false
14/06/08 15:34:20 INFO mapreduce.Job: map 100% reduce 100%
14/06/08 15:34:20 INFO mapreduce.Job: Job job_1402234146062_0002 completed successfully
if it fails,this is shown.
INFO mapreduce.Job: Job job_1402234146062_0005 failed with state FAILED due to: Task failed task_1402234146062_0005_m_000002
Job failed as tasks failed. failedMaps:1 failedReduces:0
in this case, some tasks failed. but in log files of nodemanager, datanode, resourcemanager, ... is no reason or message to find.
INFO mapreduce.Job: Task Id : attempt_1402234146062_0006_m_000002_1, Status : FAILED
Additional Information about my Configuration: used OS: centOS 6.5 Java Version: OpenJDK Runtime Environment (rhel-2.4.7.1.el6_5-x86_64 u55-b13) OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.address</name>
<value>FQDN-HOSTNAME:8050</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.localizer.address</name>
<value>FQDN-HOSTNAME:8040</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>FQDN-HOSTNAME:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>FQDN-HOSTNAME:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>FQDN-HOSTNAME:8032</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions </name>
<value>false </value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///var/data/hadoop/hdfs/nn</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>file:///var/data/hadoop/hdfs/snn</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>file:///var/data/hadoop/hdfs/snn</value>
<name>fs.checkpoint.edits.dir</name>
<value>file:///var/data/hadoop/hdfs/snn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///var/data/hadoop/hdfs/dn</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.cluster.temp.dir</name>
<value>/mapred/tempDir</value>
</property>
<property>
<name>mapreduce.cluster.local.dir</name>
<value>/mapred/localDir</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>FQDN-HOSTNAME:10020</value>
</property>
</configuration>
I hope somebody could help me. :) Thank you, Norman