I am trying to detect drive failures on the DataNodes of a Hadoop cluster. The Cloudera Manager API doesn't have any specific endpoint for that; the CM API calls I found only relate to the NameNode or to restarting services. Are there any suggestions? Thanks a lot!

  • Do you have any software that monitors your server? For example I would expect that you have some sort of monitoring software that could check for [smart messages](https://www.informaticapressapochista.com/linux/detect-hard-drive-failure-in-linux-using-s-m-a-r-t/) from your disks. See also [Check for hard disk errors / signs of failure on CentOS Server](https://serverfault.com/q/150471/416540) (though this obviously depends a lot on your disks / monitoring solutions / operating system / how fast you need to detect or predict it / ...) – Secespitus Aug 23 '19 at 08:01
  • No, we don't have any and we cannot install it. We have to develop a script that will detect disk failures on the DataNodes and alert the support team. – Dine the learner Aug 23 '19 at 08:59
  • Restrictions, such as the fact that you can't install anything or that you don't have any sort of monitoring, need to be in the question. Otherwise you will just get more people suggesting stuff that you have already excluded. What have you tried so far to get that kind of information? What operating system are you using? Is the user that needs to collect the information restricted or are you "root"? How fast do you need to detect the failure? How often does it happen? Can you just create a list of expected file systems on your server and compare that to the currently available file systems? – Secespitus Aug 23 '19 at 09:09
  • I already suggested a Graphite + Grafana solution and it got rejected. I am also concerned about not using any tool. I think he is only interested in a custom solution (not sure). I searched the Cloudera Manager API for a solution, but nothing there relates to disk failures. I need to send a mail immediately once a failure is detected. The occurrence of disk failures is not fixed; the average is about one per week. – Dine the learner Aug 23 '19 at 10:33

1 Answer

If you have access to the NameNode UI, the JMX page will give you this information. If you hit the JMX page directly, it returns JSON, which can be parsed easily.

We use HortonWorks primarily and haven't touched Cloudera in a long time, but I assume the same page can be made available there somehow.
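As a rough sketch of what parsing that page could look like: the Python below assumes the NameNode serves JMX at `/jmx` on its web UI port (commonly 50070 on Hadoop 2 and 9870 on Hadoop 3), that the `NameNodeInfo` bean exposes a `LiveNodes` attribute holding a JSON-encoded string, and that each node entry carries a `volfails` count. The hostname is a placeholder and the attribute names should be checked against the JMX output of your own cluster before relying on this.

```python
import json
from urllib.request import urlopen

# Placeholder URL: host and port are assumptions, adjust for your cluster.
JMX_URL = ("http://namenode.example.com:50070/jmx"
           "?qry=Hadoop:service=NameNode,name=NameNodeInfo")


def fetch_jmx(url=JMX_URL, timeout=10):
    """Download the JMX page and return it as a parsed dict."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def nodes_with_failed_volumes(jmx_payload):
    """Return {datanode: failed_volume_count} for nodes reporting failures.

    Assumes each bean may carry a 'LiveNodes' attribute (a JSON string
    keyed by DataNode) whose entries include a 'volfails' counter.
    """
    failed = {}
    for bean in jmx_payload.get("beans", []):
        live = bean.get("LiveNodes")
        if not live:
            continue
        # LiveNodes is usually a JSON string nested inside the JSON page.
        nodes = json.loads(live) if isinstance(live, str) else live
        for host, info in nodes.items():
            count = int(info.get("volfails", 0))
            if count > 0:
                failed[host] = count
    return failed
```

Running `nodes_with_failed_volumes(fetch_jmx())` from cron and mailing the support team whenever the result is non-empty would cover the alerting part without installing any monitoring software.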

Tony Wu