2

I'm not sure which consumer board to choose for such a configuration.

I'm planning to build one or more "Beowulf-like" clusters (starting with one for testing). Each cluster consists of four boxes (commodity Socket 1156 board + i7-875K + 2x2 GB DDR3-1333) in a tetrahedral Gbit-LAN topology (direct back-to-back X-link connections).

In the image below, each box, named A, B, C or D, has four Gbit NICs: one pointing to the upstream Gbit switch (thin line) and three for connections to the remaining boxes (each color denotes one subnet between two NICs):


(Image: tetrahedral Gbit setup)
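For illustration, this is roughly how I would address the three X-links on box A (the interface names and the 10.0.x.y/30 subnets are just placeholders; eth0 is the NIC facing the upstream switch):

    # on box A: eth1-eth3 are the three back-to-back NICs
    ip addr add 10.0.12.1/30 dev eth1   # A <-> B link
    ip addr add 10.0.13.1/30 dev eth2   # A <-> C link
    ip addr add 10.0.14.1/30 dev eth3   # A <-> D link
    # box B would use 10.0.12.2/30 on its A-facing NIC, and so on

Each /30 gives exactly two usable host addresses, one for each end of a link.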

This is meant to be a reliable €2.5K (probably cheaper) "32-node" compute server running 64-bit Linux and OpenMPI. The server starts numerical simulations on the nodes through OpenMPI; the nodes will communicate through their back-to-back connections.
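As a rough sketch of how the jobs would be launched (hostnames, slot counts and the binary name are placeholders):

    # hostfile: one entry per box, 8 hardware threads per i7-875K
    nodeA slots=8
    nodeB slots=8
    nodeC slots=8
    nodeD slots=8

    # started from the head node
    mpirun --hostfile hostfile -np 32 ./simulation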

The problem: I already tested a similar setup on a "trigonal" cluster (three nodes, each with two additional PCIe NICs plus the onboard Gbit NIC), successfully on one board type (Gigabyte P55A-UD3R).

Another board I tested (Gigabyte P55A-UD4) failed reproducibly after a few minutes under full network load (but not in single-node mode).

For the above setup, I'd like to use a board that can bear the brunt of four simultaneous Gbit links. From my trigonal setup I know (from iftop) that each NIC transfers about 50-80 MB/s at any given time.

  • Would the tetrahedral topology (as shown above) be possible at all?
  • Should I choose boards w/ two onboard Gbit NICs (expensive)?
  • Can PCIe on consumer boards sustain 4 simultaneous Gbit lines?
  • Is a bunch of cheap (passive) PCIe-NICs ok?
  • Did anybody do something similar and has recommendations?

Thanks & Regards

rbo

rubber boots

1 Answer

2

The quality of your NICs is likely to be the issue. Onboard networking on consumer boards usually means inexpensive Broadcom or Realtek chipsets, which are pretty horrible under real loads. Cheap consumer-grade add-in NICs are going to have similar problems.

Standalone NICs with the better Intel chipsets are pretty well regarded.

The PCIe on the consumer boards should be able to handle that sort of bandwidth without trouble.
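As a rough sanity check against the 50-80 MB/s per link you measured: four links at 80 MB/s is about 320 MB/s aggregate in the worst case, while a single PCIe 1.x lane already offers roughly 250 MB/s per direction and a Gbit port only needs about 125 MB/s, so single-port x1 cards have headroom and a quad-port card just wants an x4 (or better) slot.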

Buy enterprise-grade NICs for this - anything else is going to give you trouble, in terms of both speed and CPU load (cheaper NICs push their processing onto your CPU). I would consider something like the Intel Gigabit ET2 Quad Port Server Adapter.

http://www.intel.com/Products/Server/Adapters/Gb-EF-Dual-Port/Gb-EF-Dual-Port-overview.htm

That'll run you around $400, which isn't cheap, but will give you the performance you need.

Also, be very sure that you're not accidentally passing traffic through the switch. That's likely to cause issues at full load if you don't have an expensive enterprise-grade switch. Turn on jumbo frames if your hardware supports them.
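A quick way to check this on each node (interface names and addresses here are placeholders matching whatever point-to-point subnets you set up):

    # confirm the peer is reached via the direct link, not the switch-facing NIC
    ip route get 10.0.12.2      # should report "dev eth1" (the back-to-back NIC)
    # enable jumbo frames on the back-to-back links (both ends must match)
    ip link set eth1 mtu 9000

With /30 point-to-point subnets the kernel prefers the directly connected route automatically, so traffic only ends up on the switch if an address or route is misconfigured.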

Paul McMillan
  • Thank you very much for your recommendation. The boards that worked (GA-P55A-UD3R, GA-EX58-UD3R/1366) both have the onboard Realtek 8111D; the non-working board had the Realtek 8111E. Maybe that's one part of the problem. So an Intel board w/ two onboard Intel NICs + 2 Intel PCIe NICs would be fine, I'd guess (but not cheap). – rubber boots Oct 15 '10 at 21:45
  • 1
    Yeah, I agree, it really isn't cheap. Supermicro makes some quite nice boards with the Intel NICs built in, but server boards are very expensive. The cheap standalone Intel NICs ($30ish) aren't what you're looking for either - expect to pay $80+ per port for decent performance. It may be cheaper to get the big 4 port adapter rather than paying lots for a good NIC to be built into the motherboard. – Paul McMillan Oct 15 '10 at 21:51
  • 3
    @rubber boots We use the Intel 4-port NICs in our ESX servers. They've stood up to our loads. They're a solid product. – sysadmin1138 Oct 15 '10 at 21:53