
Consider the following setup:

Windows 2008 R2, MPIO feature installed. Two iSCSI NICs (not bonded), 1Gb each.
Storage: Compellent, 2x 1Gb iSCSI ports, single controller.

In my tests I have confirmed that using Round Robin MPIO, both iSCSI NICs on the host are active during a single-worker IOMETER test. Both iSCSI NICs on the storage are also active during this test. I am seeing about 50% to 60% utilization on each host NIC, and I would expect more. I am using a crappy D-Link switch at the moment and this certainly is not helping, so I'm not super concerned about this yet.

My question is this: instead of "how can I make this particular setup perform", I would like to know, more generally, if round robin (active/active) MPIO allows me to get greater than 1Gb bandwidth from the host to the storage, using a single I/O stream (like copying a file, or running a single worker IOMETER test).

If yes, why? If no, why not?

MDMarra
Jeremy
  • Thanks to all for the responses. What I am really hoping for is for someone to explain (a document reference is fine) whether MPIO allows aggregate throughput on a single connection greater than the speed of a single link, and why or why not. – Jeremy Jan 20 '12 at 23:48

3 Answers


MPIO has various policies available to it. As Coding Gorilla points out, most of those policies allow load balancing across multiple connections to aggregate bandwidth. However, both your initiator and target have to have multiple connections for it to actually be faster than single-link speed. Round Robin is a poor choice of policy; you should be using either Weighted Distribution or Least Queue Depth.
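As a rough illustration of the difference (a toy model only, not the actual Microsoft DSM), here is the idea behind the two policies: Least Queue Depth sends the next I/O down whichever path has the fewest requests in flight, so a congested path stops getting new work, while strict Round Robin keeps alternating regardless of how busy each path is.

```python
# Toy comparison of Round Robin vs. Least Queue Depth path selection.
# Illustrative only: this is not the Microsoft DSM, just the idea behind it.
from itertools import cycle

paths = ["NIC-A", "NIC-B"]
outstanding = {p: 0 for p in paths}   # I/Os currently in flight per path
rr = cycle(paths)

def pick_round_robin():
    # Strict alternation, regardless of how busy each path is.
    return next(rr)

def pick_least_queue_depth():
    # Choose whichever path has the fewest I/Os in flight right now.
    return min(paths, key=lambda p: outstanding[p])

def issue_io(policy):
    path = policy()
    outstanding[path] += 1
    return path

# Pretend NIC-A is congested: 8 I/Os are already stuck on it.
outstanding["NIC-A"] = 8

print([issue_io(pick_round_robin) for _ in range(4)])        # still alternates onto the busy path
print([issue_io(pick_least_queue_depth) for _ in range(4)])  # favors NIC-B until it catches up
```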

The iSCSI SAN and server I have here have 4 ports each, and I can actually get ~3.2Gbps under fairly ideal circumstances. If you need something faster than that, you'd be looking at FC or IB.

Also, do not use Trunking/Link Aggregation/etc. on iSCSI links. When one link fails, the connection will fail. You must use MPIO to accomplish link redundancy.

Chris S
  • Hi Chris, what is your setup such that you get 3.2Gbps over 4x 1Gb ports? Also regarding multiple connections, multiple NICs on both the host and the SAN are being used actively during these tests. I guess I don't have to worry about that, then? Many thanks! – Jeremy Jan 20 '12 at 21:53
  • The servers have two dual-port Broadcom BCM5709 chips in them (HP DL380G5) and the SAN is an HP MSA2300i with two dual-port controllers. The servers run Windows and have MPIO enabled with the Least Queue Depth scheduling policy. There are a pair of ProCurve 2510G switches linking everything up. Everything is a pretty standard configuration. – Chris S Jan 20 '12 at 22:06
  • @ChrisS +1 for avoiding trunking with iSCSI. If you set up MPIO correctly, it works great at giving you increased throughput and resiliency. – tegbains Jan 20 '12 at 22:19
  • @ChrisS, thanks for the additional info. One question: with Windows, do you run two separate subnets for iSCSI traffic? – Jeremy Jan 20 '12 at 23:50
  • I do have two separate vLANs for iSCSI currently (there are two more vLANs for "normal" traffic and management as well). Both switches are configured for each vLAN; everything is criss-cross connected (each NIC chip has two ports, one port on each vLAN, and each connected to a different switch; the other chip is connected the same way but reversed, so the vLANs for each port connect to the opposite switch). This is starting to sound really complicated when trying to describe it in text, but it all makes sense when drawn out (I think). =] – Chris S Jan 21 '12 at 15:36
  • The two separate vLANs for iSCSI aren't necessary, depending on how your network is set up. Since mine was installed we've made some changes and it's no longer necessary that they're separate, but I'm not messing with a production SAN that works perfectly well. When it gets cycled out this summer I'll create the new environment with a single iSCSI vLAN. – Chris S Jan 21 '12 at 15:39
  • @ChrisS, when you say you need multiple targets, do you need multiple targets per connection, or per LUN? Right now I have 2x 1Gb iSCSI connections, each has one initiator IP, and goes to one target IP. So two initiators and two targets. What I am seeing is that using one path will max out the NIC, while using two paths (least queue depth or round robin) gives me about 50-60% utilization on both NICs. While that is marginally better than 100% utilization on one, I know the storage can keep up with 2Gb. So I'm not sure what's wrong... what tests did you run to get 3.2Gb/sec over 4 NICs? – Jeremy Jan 26 '12 at 22:39
  • See this other [Answer](http://serverfault.com/a/168598) for the details on how to set up iSCSI MPIO. If you're using standard MS MPIO and your target only has one IP, you will not be able to use MPIO to get higher throughput. Every NIC must have its own IP (in standard MPIO), and the lesser of the two sides will determine the maximum speed. There are third-party MPIO drivers (usually from the SAN manufacturer) that may work differently. Sounds like you've got 2 IPs at the Initiator and 1 IP at the Target; you'll get single-NIC speed with redundancy at the Initiator side with that setup. – Chris S Jan 27 '12 at 03:30

I'm not an expert on the MPIO features and iSCSI, but from TechNet (http://technet.microsoft.com/en-us/library/dd851699.aspx):

Round Robin - Load balancing policy that allows the Device Specific Module (DSM) to use all available paths for MPIO in a balanced way. This is the default policy that is chosen when the storage controller follows the active-active model and the management application does not specifically choose a load-balancing policy.

This to me says that it's simply distributing the traffic across the two, and it's not going to try to push either one to its limits in order to increase performance.
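One way to see why that caps a single stream at roughly one link's speed (a back-of-the-envelope sketch; it assumes the 1Gb links are the only bottleneck and ignores protocol overhead): a path only carries an I/O that has actually been issued to it, so the number of links that can be busy at the same instant is limited by the number of I/Os in flight.

```python
# Back-of-the-envelope: aggregate throughput vs. I/Os in flight.
# Assumes the 1Gb links are the only bottleneck and ignores protocol overhead.

LINK_GBPS = 1.0   # one GigE iSCSI path
NUM_PATHS = 2     # two NICs on the host, two ports on the storage

def max_aggregate_gbps(ios_in_flight: int) -> float:
    # Each in-flight I/O occupies at most one path at a time, so only
    # min(I/Os in flight, paths) links can be carrying data simultaneously.
    busy_paths = min(ios_in_flight, NUM_PATHS)
    return busy_paths * LINK_GBPS

for qd in (1, 2, 4, 8):
    print(f"{qd} outstanding I/O(s): at most {max_aggregate_gbps(qd):.1f} Gbps")

# A strictly sequential stream (queue depth 1) can never exceed 1.0 Gbps no
# matter which policy rotates the paths; deeper queues are what allow both
# links to carry data at once.
```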

Also, from a purely networking perspective, if you have both NICs connected to the same switch, then you're not going to get more than 1Gb. Most "consumer" switches are only going to handle 1Gb of traffic max, not per port. There are higher-end switches with a better backplane that can handle more traffic, but I still doubt that you'd get much more out of them. You would be better off putting each NIC on a separate segment (i.e. a separate switch) to eliminate that potential "bottleneck".

Like I said, I'm not an expert on the subject; those are just my initial reactions. Feel free to correct me where I'm mistaken.

Coding Gorilla
  • "Most switches are only going to handle 1Gb of traffic max, not per port": most switches these days have a switching fabric much higher than a single port's speed. All my switches are "full" switching fabric, meaning if it's a 24-port 1Gb switch, then it can forward 48Gb of traffic under ideal circumstances. – Chris S Jan 20 '12 at 21:45
  • I edited my answer a bit; I was referring to "consumer"-oriented switches based on his "crappy D-Link" comment. But thanks for the correction; it's been a long time since I've had to buy switches. "Back in the day" those kinds of switches were very expensive. – Coding Gorilla Jan 20 '12 at 21:50
  • D-Link's cheapest **consumer** gigabit switch, the 5-port DGS-1005G, is rated for a `10 Gbps switching fabric`. It's been years since I've even seen a switch available that isn't full fabric (except large modular switches; that's another story). – Chris S Jan 20 '12 at 21:51
  • Ok, I'm a little behind the times. Thanks for the correction. – Coding Gorilla Jan 20 '12 at 21:54

MPIO with EqualLogic basically picks the best iSCSI HBA interface to send from, and the best interface on the SAN, based on evaluated load. To my knowledge, you'll only get one stream per LUN, meaning you're not going to split the traffic in half over an Ethernet link, so you'll never get more than 1Gbps per connection to that LUN per host. Now, if you have multiple LUNs, you can hit other interfaces on the SAN to balance out the throughput. This, however, is based on my understanding of MPIO. Also, as mentioned, there's no need for link aggregation, and the switch is probably not your problem (unless it has a throughput limit that you're hitting, i.e. overcommit).
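As a sketch of that per-LUN idea (a toy model only, based on the understanding described above rather than EqualLogic's actual MPIO plugin): if each LUN's traffic rides one path at any given moment, a single LUN tops out at one link, but two LUNs can land on different SAN interfaces and use both links in parallel.

```python
# Toy model of per-LUN path assignment (not EqualLogic's real MPIO plugin).
# Each LUN's traffic uses one path at a time; different LUNs may be placed
# on different paths, so aggregate throughput grows with the number of LUNs.

LINK_GBPS = 1.0
PATHS = ["host-nic1 -> san-eth0", "host-nic2 -> san-eth1"]

def assign_paths(luns):
    # Spread LUNs across the available paths; each LUN sticks to one path.
    return {lun: PATHS[i % len(PATHS)] for i, lun in enumerate(luns)}

def max_aggregate_gbps(assignment):
    # Only as many links are busy as there are distinct paths in use.
    return len(set(assignment.values())) * LINK_GBPS

one_lun  = assign_paths(["LUN0"])
two_luns = assign_paths(["LUN0", "LUN1"])
print(one_lun,  "->", max_aggregate_gbps(one_lun),  "Gbps max")   # 1.0 Gbps
print(two_luns, "->", max_aggregate_gbps(two_luns), "Gbps max")   # 2.0 Gbps
```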

Here's a good doc on getting it set up that goes over the various options:

http://www.dellstorage.com/WorkArea/DownloadAsset.aspx?id=2140

Eric C. Singer
  • I don't know exactly what MPIO driver you're using, but you'd have to be using a poorly written or ancient driver for it not to be able to split traffic across multiple NICs. The one built into Windows 2008+ is able to aggregate multiple NICs (not sure what the limit is, but at least 4 NICs). There are some requirements for using multiple ports, but it's nothing earth-shattering. – Chris S Jan 21 '12 at 00:25
  • To be clear, when I say you can't split traffic, I mean the actual packet can't be split in half; it's only going to go over one path at a time. If you have multiple connections, that will load balance individual packets across multiple interfaces. However, only one path to a LUN from a given host is used at a time. Meaning, I don't think (and could be wrong) that you're going to use both links in parallel to write/read a piece of data. – Eric C. Singer Jan 21 '12 at 01:16
  • Meaning one I/O operation goes over one path at a time, so if you have 4 I/O operations in queue, for each operation it will choose the best path. It's not going to send 2 I/Os out one path and, at the same time, send the other 2 out the other path. Now, I'm saying this is the behaviour per LUN, so if I want to write to LUN 1 and write to LUN 2, I think it would then run those operations in parallel. This link gives a very basic overview of EQL: http://www.youtube.com/watch?v=VB2Q2mo9EEU – Eric C. Singer Jan 21 '12 at 01:17
  • Also, my experience is with VMware, not Windows, so I'm wondering if this might be a specific VMware limitation (if this limitation actually does exist). – Eric C. Singer Jan 21 '12 at 01:53
  • You are correct that connections cannot be split over multiple NICs simultaneously. It is, however, normal to create multiple connections to the same LUN and let MPIO split traffic over the various NICs. – Chris S Jan 21 '12 at 15:32
  • I'm wondering if it's a limitation of EQL + VMware? I'm pretty sure I've read that the only MPIO driver that does true active/active in VMware is EMC's PowerPath. I'm going to be setting up a Windows host soon with EQL's MPIO, so I'm going to try SQLIO and see if I can create multiple connections to the same LUN. Linked here is where I saw this info, and I remember reading similar info at other sites: http://www.brentozar.com/archive/2009/05/san-multipathing-part-2-what-multipathing-does/ – Eric C. Singer Jan 21 '12 at 18:46
  • That could very well be the case. Multiple sessions are dependent on everything supporting them. I'm not very familiar with VMware, so I can't say what it takes for it to do multiple active sessions. – Chris S Jan 21 '12 at 20:51