
I have seen lots of answers on SO and Quora, along with many websites. Some problems were solved by configuring the firewall for the slaves' IPs; some said it's a UI glitch. I am confused.

I have two datanodes: one is a pure datanode and the other is namenode + datanode. The problem is that when I open <master-ip>:50075 the UI shows only one datanode (the one on the machine that also runs the namenode). However, hdfs dfsadmin -report shows two datanodes, and after starting Hadoop on the master, running jps on the pure-datanode (slave) machine shows a DataNode process running. The firewall on both machines is off; sudo ufw status verbose gives Status: inactive.

The same scenario occurs with Spark: the Spark UI shows only the worker on the master node, not the pure worker node, even though a worker is running on the pure-worker machine. Again, is this a UI glitch or am I missing something?

Output of hdfs dfsadmin -report:

Configured Capacity: 991216451584 (923.14 GB)
Present Capacity: 343650484224 (320.05 GB)
DFS Remaining: 343650418688 (320.05 GB)
DFS Used: 65536 (64 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 10.10.10.105:50010 (ekbana)
Hostname: ekbana
Decommission Status : Normal
Configured Capacity: 24690192384 (22.99 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 7112691712 (6.62 GB)
DFS Remaining: 16299675648 (15.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 66.02%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 25 04:27:36 EDT 2017


Name: 110.44.111.147:50010 (saque-slave-ekbana)
Hostname: ekbana
Decommission Status : Normal
Configured Capacity: 966526259200 (900.15 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 590055215104 (549.53 GB)
DFS Remaining: 327350743040 (304.87 GB)
DFS Used%: 0.00%
DFS Remaining%: 33.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 25 04:27:36 EDT 2017

/etc/hadoop/masters file on master node

ekbana

/etc/hadoop/slaves file on master node

ekbana
saque-slave-ekbana

/etc/hadoop/masters file on slave node

saque-master

Note: saque-master on the slave machine and ekbana on the master machine are mapped to the same IP. Also, the UI looks similar to this question's UI.
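For illustration, this is roughly what I believe the relevant /etc/hosts entries look like (the IP is the master's address from the report above; the exact lines on each machine are my assumption):

On the master machine:
10.10.10.105    ekbana

On the slave machine:
10.10.10.105    saque-master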

Saurab
2 Answers


It's because both datanodes report the same hostname (ekbana), so the UI shows only one entry for that hostname.

If you want to confirm this, start only the datanode that is not on the master; you will see an entry for it in the UI.

If you then start the other datanode too, it will mask the second entry for the same hostname.

You can change the hostname and try again.
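For example, a rough sketch of giving the slave a distinct hostname (the name saque-slave-ekbana and the IP are taken from the report above; whether hostnamectl is available depends on your distribution):

On the slave machine:
sudo hostnamectl set-hostname saque-slave-ekbana

Make sure the new name resolves on both machines, e.g. in /etc/hosts:
110.44.111.147    saque-slave-ekbana

Then restart that datanode so it re-registers with the namenode under the new hostname.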

Rahul
    Thanks, the Hadoop UI now shows two datanodes, but the Spark UI is still showing one worker (the one that is also the master, not the pure worker) – Saurab Jul 27 '17 at 05:09

I also faced a similar issue, where I couldn't see the datanode information on the dfshealth.html page. I had two hosts, named master and slave.

etc/hadoop/masters (on the master machine)
master

etc/hadoop/slaves (on the master machine)
master
slave

etc/hadoop/masters (on the slave machine)
master

etc/hadoop/slaves (on the slave machine)
slave

With this setup I was able to see the datanodes on the UI.
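A minimal sketch of applying such a change, assuming the stock Hadoop 2.x start/stop scripts under $HADOOP_HOME/sbin:

$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh
hdfs dfsadmin -report    # both datanodes should now show up as live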

Raxit Solanki