I have installed HDFS on a 12-node cluster deployed on EC2 (AWS) instances. Each EC2 instance has two network interfaces, eth0 and eth1. eth1 has a static IP address, while eth0 has an IP address that changes whenever the instance is rebooted. Let's say eth0's IP address is 'ABC' and eth1's IP address is 'XYZ'. In my hosts file (/etc/hosts) I have added an entry for every node, mapping its FQDN to the IP address of eth1. For some reason, when a DataNode tries to connect to the NameNode, it uses the IP address of eth0 ('ABC' in this case). The DataNode then fails to initialize.
The error from the logs is given below:
Initialization failed for Block pool BP-1423100917-name_node_host-1544213589860 (Datanode Uuid 160d6133-54f1-4a29-a6f0-0e52c0c59708) service to NAME_NODE_HOSTNAME.net/NAME_NODE_IP:8022 Datanode denied communication with namenode because hostname cannot be resolved (ip=ABC, hostname=ABC): DatanodeRegistration(XYZ, datanodeUuid=160d6133-54f1-4a29-a6f0-0e52c0c59708, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster8;nsid=2080909946;c=0)
I have tried the options below to fix this issue, but neither worked.
Setting the property dfs.datanode.dns.interface to eth1 in hdfs-site.xml on both the DataNodes and the NameNode, then restarting the HDFS service. I also tried changing it on only the DataNodes or only the NameNode.
Setting the property dfs.namenode.datanode.registration.ip-hostname-check to 'false' in hdfs-site.xml on both the DataNodes and the NameNode, then restarting the HDFS service. I also tried changing it on only the DataNodes or only the NameNode.
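For reference, the two settings above would look roughly like this in hdfs-site.xml. This is only a sketch: the property names and values are the ones mentioned in this post, they appear in one fragment purely for brevity, and a real hdfs-site.xml contains other properties that are omitted here.

```xml
<!-- hdfs-site.xml (fragment): only the two properties discussed in this post.
     All other cluster properties are omitted. -->
<configuration>
  <!-- Interface the DataNode should use when determining its address -->
  <property>
    <name>dfs.datanode.dns.interface</name>
    <value>eth1</value>
  </property>
  <!-- Whether the NameNode requires a connecting DataNode's IP to resolve to a hostname -->
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
</configuration>
```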
Most previous posts about this error point to the parameters mentioned above, but they did not work for me. Has anyone faced the same issue?