I have a question about Datanode disk setup in a Hadoop cluster. Which of these two options is better:
Add one (or a few) disks to the Datanode, and attach more as they start to fill up.
Or start with as many disks as possible from the beginning and fill them all at the same time.
Two other related questions: Is it best to get the largest drives possible, in order to maximize capacity given the limited number of drive slots?
How much storage can a single Datanode support? (Of course it depends on the Datanode's hardware specification, but still... is there an approximate limit?)