I have a question about Datanode disk setup in a Hadoop cluster. Which of these two options is better:

  1. To add one (or a few) disks to the Datanode, and attach more once they start to fill up.

  2. Or to start with as many disks as possible from the beginning and to fill them at the same time.

Two other related questions: Is it best to get drives as big as possible in order to maximize capacity for the limited number of drive slots?

How much storage can a single Datanode support? (of course it depends on the Datanode hardware specification, but still... any approximate limit?)

mart
  • Wikipedia claims that HDFS is "designed to scale to tens of petabytes". https://en.wikipedia.org/wiki/Apache_Hadoop#Other_file_systems See also https://en.wikipedia.org/wiki/Apache_Hadoop#Prominent_users – user Oct 27 '16 at 15:40

1 Answer

First, the number of spindles is directly correlated with the performance of your MapReduce jobs (up to a point). In general, aim for roughly 1-2 CPU cores per spindle.
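As a rough illustration of that rule of thumb, here is a minimal Python sketch that turns a core count into a range of spindle counts. The function name and the example core counts are my own assumptions for illustration, not anything Hadoop provides.

```python
def spindles_for_cores(cpu_cores, cores_per_spindle_range=(1, 2)):
    """Return (min_spindles, max_spindles) under the 1-2 cores per spindle rule of thumb.

    Hypothetical helper for sizing only; the ratio is a guideline, not a hard limit.
    """
    low, high = cores_per_spindle_range
    # More cores per spindle means fewer spindles, so the high ratio gives the minimum.
    min_spindles = cpu_cores // high
    max_spindles = cpu_cores // low
    return min_spindles, max_spindles


print(spindles_for_cores(16))  # (8, 16)
print(spindles_for_cores(24))  # (12, 24)
```

So a 16-core node would sit somewhere between 8 and 16 data disks under this guideline.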

Second, balancing out additional spindles after the fact can be a challenge. Code for intra-datanode rebalancing across spindles has only recently been added. The regular balancer only balances across nodes, so you can still end up with unbalanced spindles. There is some support for tweaking the block placement policy so that usage levels out over time when you add new spindles, but that means new data gets written only to the least-used spindles.
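To make that trade-off concrete, here is a toy Python model of one datanode with three partly full disks and one newly added empty disk. It is only a sketch under my own assumptions: usage is counted in whole blocks, capacity limits are ignored, and the two policies are simplified stand-ins rather than HDFS's actual implementation.

```python
def simulate_writes(used_blocks, n_blocks, policy):
    """Place n_blocks on the disks and return (writes_per_disk, final_usage).

    used_blocks: current block count per disk (not modified).
    policy: "round_robin" or "least_used" (toy versions, not HDFS code).
    """
    used = list(used_blocks)
    writes = [0] * len(used)
    next_disk = 0
    for _ in range(n_blocks):
        if policy == "round_robin":
            disk = next_disk
            next_disk = (next_disk + 1) % len(used)
        elif policy == "least_used":
            # Always pick the emptiest disk (first one wins on ties).
            disk = min(range(len(used)), key=lambda i: used[i])
        else:
            raise ValueError(f"unknown policy: {policy}")
        used[disk] += 1
        writes[disk] += 1
    return writes, used


# Three old disks holding 4000 blocks each, plus one new empty disk.
start = [4000, 4000, 4000, 0]
print(simulate_writes(start, 1000, "round_robin")[0])  # [250, 250, 250, 250]
print(simulate_writes(start, 1000, "least_used")[0])   # [0, 0, 0, 1000]
```

Round-robin keeps the write load spread out but never closes the gap, while least-used closes the gap by sending every new write to the single new spindle until it catches up.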

Third, I probably wouldn't go beyond 12x 6TB drives (about 72TB per datanode) at this point. That will hold a few million blocks' worth of data. Above that, you start running into performance issues that require cluster tuning, such as the datanode block report taking too long. A lot of this depends on your use case.
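For a back-of-the-envelope feel for those numbers, this Python sketch estimates how many block replicas fit on a datanode for a given average block size. The helper name and the 32 MB average are assumptions for illustration; 128 MB is the default `dfs.blocksize` in recent Hadoop releases, but real files rarely fill every block.

```python
def estimate_block_replicas(drives, drive_tb, avg_block_mb):
    """Estimate how many block replicas fit on one datanode.

    Hypothetical helper: assumes decimal terabytes (as drives are sold),
    binary megabytes for blocks, and no space reserved for non-HDFS use.
    """
    capacity_bytes = drives * drive_tb * 10**12
    block_bytes = avg_block_mb * 1024**2
    return capacity_bytes // block_bytes


# 12 x 6TB drives with every block a full 128 MB: roughly half a million replicas.
print(estimate_block_replicas(12, 6, 128))

# Same hardware with an assumed 32 MB average block: roughly two million replicas.
print(estimate_block_replicas(12, 6, 32))
```

The smaller the average block, the more replicas the datanode has to track and report, which is where the block report time starts to hurt.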

I've seen clusters with much higher drive density, but it took a lot of tweaking to make them work (and even then there were still issues).

Travis Campbell