Application A sends traffic to B via TCP. B forwards the traffic to C via multicast (UDP). Applications on D receive this data via TCP.

A -> (TCP) -> B -> (UDP) -> C -> (TCP) -> D

A - Windows Server
B - Linux VM on ESX host
C - Linux VM on another ESX host (no load, just this one VM)
D - Windows clients

When B sends traffic to C, numerous drops can be observed at the NIC (ixgbe) level due to the bursty nature of the traffic; the rx_missed_errors counter corresponds directly to the loss the application on C observes.
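This is roughly how I watch that counter on C (eth0 is a placeholder for the actual 10G interface):

    # NIC-level statistics; rx_missed_errors climbs during the bursts
    ethtool -S eth0 | grep rx_missed
    # the same counter is also exposed by the kernel
    cat /sys/class/net/eth0/statistics/rx_missed_errors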

Increasing the ring buffer size (on C) to its maximum value (4096) makes things even worse.
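For completeness, the ring size was checked and changed with ethtool (again, eth0 stands in for the real interface):

    # show current and maximum ring sizes
    ethtool -g eth0
    # raise the RX ring to its maximum
    ethtool -G eth0 rx 4096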

All links are 10G, and traffic levels do not exceed 2 Gbit/s even during bursts (checked using sar -n DEV 1).

Questions:

  1. How can I measure bursts at intervals of less than one second? (A rough polling sketch is included after this list.)

  2. Why can increasing the ring size make things worse?

  3. Is there a way to slow down the traffic on B so that C can handle it without drops at the NIC level (traffic shaping, changing the TCP window size/buffer size)? A tc sketch of what I have in mind follows this list.

  4. Why do the drops disappear completely when I replace B with a Windows host, as if the UDP bursts were shaped in a more digestible way?

  5. How else could I approach or analyze this problem?
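Regarding question 1, the best I have come up with so far is polling the kernel byte counters in a tight loop; a minimal sketch (the interface name and the 100 ms interval are arbitrary choices):

    #!/bin/bash
    # Sample rx_bytes every 100 ms to expose sub-second bursts that
    # sar's 1-second resolution averages away.
    IFACE=${1:-eth0}
    prev=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
    while true; do
        sleep 0.1
        cur=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
        # bytes over 0.1 s -> Mbit/s: delta * 8 bits / 0.1 s / 10^6
        echo "$(date +%H:%M:%S.%N)  $(( (cur - prev) * 8 / 100000 )) Mbit/s"
        prev=$cur
    done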
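Regarding question 3, what I have in mind on B is something along the lines of a token bucket filter to smooth the multicast bursts; this is only a rough, untested sketch (interface, rate, burst and latency values are guesses):

    # shape all egress on B's 10G interface down to ~2 Gbit/s,
    # allowing 1 MB bursts and queueing up to 50 ms worth of traffic
    tc qdisc add dev eth0 root tbf rate 2gbit burst 1mb latency 50ms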

Thanks
