We are planning to build a Hadoop cluster with 12 DataNode machines, with a replication factor of 3 and a DataNode failed disk tolerance of 1.

The DataNode machines include the disks for HDFS. We have not found any criteria for how many disks each DataNode needs, so we are not sure about the minimum number of disks that should be allocated to each one.

What is the minimum number of disks for each DataNode, given that the replication factor is 3?

King David

1 Answer

Since your failed disk tolerance is 1, you should preferably have at least 3 disks for HDFS on each DataNode: even if you lose 1 disk, you still have 2 disks running and some margin against a further disk failure. In addition, use 1 separate disk for the OS and other related software, to keep things separated.
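
The reasoning above can be sketched in a few lines of Python. This is only an illustration, not Hadoop code: the function names are hypothetical, and it assumes that "DataNode failed disk tolerance" behaves like the `dfs.datanode.failed.volumes.tolerated` setting, so that a DataNode keeps serving as long as the number of failed HDFS disks does not exceed the tolerance.

```python
def healthy_disks(hdfs_disks: int, failed_disks: int) -> int:
    """Number of HDFS disks still in service after some have failed."""
    if not 0 <= failed_disks <= hdfs_disks:
        raise ValueError("failed_disks must be between 0 and hdfs_disks")
    return hdfs_disks - failed_disks


def datanode_keeps_serving(failed_disks: int, tolerated: int) -> bool:
    """Assumed semantics: the DataNode stays up while failures <= tolerance."""
    return failed_disks <= tolerated


TOLERATED = 1  # failed disk tolerance from the question

for hdfs_disks in (2, 3, 4):  # candidate disk counts, chosen for illustration
    left = healthy_disks(hdfs_disks, failed_disks=1)
    up = datanode_keeps_serving(failed_disks=1, tolerated=TOLERATED)
    print(f"{hdfs_disks} HDFS disks, 1 failed: {left} left, DataNode serving: {up}")
```

With 3 HDFS disks, one failure leaves 2 disks in service, which is the situation described above. Under the same assumption, a second failure on that node would exceed a tolerance of 1.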

It is generally recommended to use a larger number of smaller disks.
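
One way to see why, as a rough sketch: for the same raw capacity per node, more disks means each disk holds a smaller share of the node's data, so a single disk failure takes less data offline and leaves less to re-replicate. The disk counts and sizes below are made-up examples, the function names are hypothetical, and the estimate ignores space reserved for non-HDFS use.

```python
def usable_cluster_tb(nodes: int, disks_per_node: int, disk_tb: float,
                      replication: int = 3) -> float:
    """Rough usable HDFS capacity: raw capacity divided by the replication factor."""
    return nodes * disks_per_node * disk_tb / replication


def share_lost_per_disk(disks_per_node: int) -> float:
    """Fraction of a node's HDFS capacity lost when one disk fails (equal-size disks)."""
    return 1 / disks_per_node


NODES = 12        # DataNodes, from the question
REPLICATION = 3   # replication factor, from the question

# Two hypothetical layouts with the same raw capacity per node (24 TB).
layouts = [(3, 8.0), (6, 4.0)]  # (disks per node, TB per disk)

for disks, size_tb in layouts:
    usable = usable_cluster_tb(NODES, disks, size_tb, REPLICATION)
    lost = share_lost_per_disk(disks)
    print(f"{disks} x {size_tb} TB: usable ~{usable:.0f} TB, "
          f"one disk failure affects {lost:.0%} of a node's capacity")
```

Both layouts give the same usable capacity, but with 6 smaller disks a single failure affects half as much of the node's data as with 3 larger ones.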

Please refer to the following link for a better understanding of storage architecture selection:

https://www.tcs.com/content/dam/tcs/pdf/technologies/bigdata/abstract/Big%20Data%20Capacity%20Planning.pdf