
I want to share the details of our case.

We have a Hadoop cluster with 2 name nodes: one active name node and one standby name node.

After some time we noticed that both the active name node and the secondary name node had been down for 3 days.

After reviewing the name node log files, we saw that the secondary name node had been down for 1 month and the active name node for a couple of hours.

The other interesting thing we saw in the active name node log is a name node heap size problem. As some of you may know, the secondary name node supports the active name node but does not replace it.

We therefore guess that the active name node failed because it did not get data acknowledgements from the secondary name node, and that this may also be the reason for the high JVM memory consumption on the active name node.
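One way to check the heap-pressure guess is to scan the active name node log for long JVM pause warnings. The following is only a minimal Python sketch: the message pattern is an assumption based on the warning that Hadoop's JvmPauseMonitor normally writes and should be verified against the real log, and the function name `find_long_pauses` and the 10-second threshold are hypothetical choices for illustration.

```python
import re

# Assumed wording of the JvmPauseMonitor warning; verify it against your own log.
PAUSE_RE = re.compile(
    r"Detected pause in JVM or host machine.*?pause of approximately (\d+)ms"
)


def find_long_pauses(lines, threshold_ms=10_000):
    """Return (line_number, pause_ms) for every pause at or above threshold_ms."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PAUSE_RE.search(line)
        if match is None:
            continue
        pause_ms = int(match.group(1))
        if pause_ms >= threshold_ms:
            hits.append((number, pause_ms))
    return hits
```

It accepts any iterable of lines, so an open log file can be passed directly, for example `find_long_pauses(open(path, errors="replace"))`, where `path` is wherever the name node log lives. Many long pauses shortly before the failure would be consistent with heap exhaustion, but they would not by themselves show that the secondary name node caused it.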


I would appreciate help from Stack Overflow users, and your opinions about our case.

  • In this configuration the NN is a SPOF, and as soon as it goes down the whole cluster becomes unavailable. So I am surprised you noticed it only after 3 days. My opinion: you should set up HA for your namenode. Anyway, what's the question? – mazaneicha Apr 25 '20 at 21:30
  • The standby name node went down first, so even if HA is working, the active is still up until the failure. – jessica Apr 26 '20 at 04:52
