
One of the disks on a datanode in my Hadoop cluster has become read-only. I am not sure what caused this problem. Will removing this volume from the datanode cause data loss? How should I handle this if data loss is possible?

Immanuel Fredrick

1 Answer


If your Hadoop cluster has a replication factor greater than 1 (the default is 3 for a multi-node cluster), your data will have been replicated across multiple datanodes. You can check your replication factor value (dfs.replication) in hdfs-site.xml.
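For example, a quick way to confirm the effective replication factor is with the hdfs getconf command; the config file path below is an assumption and may differ on your install:

```sh
# Print the value of dfs.replication as the cluster sees it
hdfs getconf -confKey dfs.replication

# Or inspect the property directly in hdfs-site.xml
# (the /etc/hadoop/conf path is typical but varies by distribution)
grep -A1 'dfs.replication' /etc/hadoop/conf/hdfs-site.xml
```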

So if you remove this read-only datanode from your cluster and your replication factor is greater than 1, you will not face any data loss, because the cluster holds a corresponding replica on another datanode. HDFS will automatically re-replicate the resulting under-replicated blocks, and the cluster will subsequently return to a stable state.
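As a sketch, you can watch that re-replication progress after removing the volume with the standard fsck and dfsadmin reports (run as the HDFS superuser; exact output format varies by Hadoop version):

```sh
# Report blocks that are currently under-replicated
hdfs fsck / | grep -i 'under-replicated'

# Overall datanode and block status for the cluster
hdfs dfsadmin -report
```

Once fsck no longer reports under-replicated blocks, the cluster has finished rebalancing the replicas.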

PradeepKumbhar