We're evaluating options for setting up a large Hadoop cluster. We can choose between these three setups:

  • 300 servers, each with 12x 1TB disks
  • 150 servers, each with 12x 2TB disks
  • 100 servers, each with 12x 3TB disks

The other server specifications are identical. Which would you choose and, more importantly, why?

Best regards, Robin

RobinUS2
  • There's nothing inherently wrong with choosing 3TB (or 4TB) disks over 1TB ones, but I'm worried about what RAID level you're planning on using - PLEASE say it's R10 or R6 with that many consumer disks and not R5. – Chopper3 Sep 26 '12 at 11:43
  • What metrics are YOU using to make your decision? – MikeyB Sep 26 '12 at 13:06
  • @Chopper3: who says they're consumer disks? (OK, they probably are). Also, with Hadoop you throw individual disks at it and let Hadoop handle the replication. No RAID. – MikeyB Sep 26 '12 at 13:08
  • In hadoop we use NO RAID, just as @MikeyB suggests. – RobinUS2 Sep 26 '12 at 13:26
  • @MikeyB are there any 3TB disks with 100% duty cycle available yet? I thought the first 2TB ones that are at least supported to that level only just came out 3-6 months ago? Happy to be wrong. – Chopper3 Sep 26 '12 at 13:54
  • @Chopper3: Here's one: http://www.hgst.com/internal-drives/enterprise/ultrastar/ultrastar-7k3000 – MikeyB Sep 26 '12 at 14:32

1 Answer

The more servers you have, the more horsepower you have. All three options have the same raw capacity (3,600 TB); however, only someone who knows what this cluster will be used for can decide between them.

Edit: I'm including disk I/O horsepower in this. The more disks you have, the more random IO/s you can push, and the more MB/s you can push under a sequential workload. Each spindle (disk) adds roughly the same amount, so aggregate performance scales about linearly with the number of disks.
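
To make the arithmetic concrete, here is a rough Python sketch comparing the three options from the question. The per-disk figures (100 random IO/s and 100 MB/s sequential) are assumptions picked only for illustration and are not from the question; real numbers depend on the drive model and workload. The `Option` class and its field names are likewise made up for this example.

```python
from dataclasses import dataclass

# Assumed per-disk figures, for illustration only (not from the question).
ASSUMED_IOPS_PER_DISK = 100   # rough random IO/s for one spinning disk
ASSUMED_MBPS_PER_DISK = 100   # rough sequential MB/s for one spinning disk


@dataclass(frozen=True)
class Option:
    """One of the cluster layouts listed in the question."""
    servers: int
    disks_per_server: int
    tb_per_disk: int

    @property
    def spindles(self) -> int:
        # Total number of physical disks in the cluster.
        return self.servers * self.disks_per_server

    @property
    def raw_capacity_tb(self) -> int:
        # Raw capacity before HDFS replication is taken into account.
        return self.spindles * self.tb_per_disk

    @property
    def aggregate_iops(self) -> int:
        # Linear scaling with spindle count, as described above.
        return self.spindles * ASSUMED_IOPS_PER_DISK

    @property
    def aggregate_mbps(self) -> int:
        return self.spindles * ASSUMED_MBPS_PER_DISK


OPTIONS = [Option(300, 12, 1), Option(150, 12, 2), Option(100, 12, 3)]

for opt in OPTIONS:
    print(
        f"{opt.servers} servers: {opt.spindles} disks, "
        f"{opt.raw_capacity_tb} TB raw, "
        f"~{opt.aggregate_iops} IO/s, ~{opt.aggregate_mbps} MB/s"
    )
```

Under these assumptions every option ends up with the same raw capacity, but the 300-server layout has three times the spindles of the 100-server one, and therefore about three times the aggregate disk throughput.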

Basil
  • I understand that we have less CPU and less memory. But is there also a negative performance impact on the disk speed (e.g. read / write)? – RobinUS2 Sep 26 '12 at 13:27
  • Yes, you have less disk horsepower as well. I didn't differentiate in my answer, but the more heads you have, the more IO/s or MB/s you can push. – Basil Sep 26 '12 at 15:30