I have a moderately complex network topology between my external firewall and the internet, as shown below.

Every so often (I haven't found a pattern yet) we see significant packet loss, around 25%. Most of the time it's under 0.5%. As far as I can tell, the only commonality is that all of the dropped traffic crosses the link between the VPN server (a Cisco ASA 5505) and the gateway router (a Cisco 2901).

Edit

In addition to dropped packets, I'm also looking at response times. Any ping from the gateway router to the VPN server or the fiber uplink takes exactly 200 milliseconds longer than a ping that stops one hop short.

Since high ping response times are a common indicator of a maxed-out CPU, I checked show process cpu, but it shows only about 40% utilization.

Any thoughts?

End Edit
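To make the two symptoms above (loss percentage and the fixed extra delay) easier to compare across hops, here is a minimal Python sketch. It assumes ping results have already been collected as lists of round-trip times in milliseconds, with `None` for a lost probe. The function names and the sample values are hypothetical illustrations, not measurements from this network.

```python
from statistics import median


def summarize(rtts_ms):
    """Return (loss percentage, median RTT in ms) for one ping target.

    rtts_ms holds one entry per probe: a float RTT, or None if the probe was lost.
    """
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    return loss_pct, (median(replies) if replies else None)


def added_latency_ms(near_rtts_ms, far_rtts_ms):
    """Median RTT to the far hop minus median RTT to the hop just before it."""
    _, near = summarize(near_rtts_ms)
    _, far = summarize(far_rtts_ms)
    if near is None or far is None:
        return None  # no replies from one of the hops, nothing to compare
    return far - near


if __name__ == "__main__":
    # Hypothetical samples: one hop short of the router, then one hop past it.
    near = [1.0, 1.2, None, 1.1]
    far = [201.0, 201.3, 200.9, None]
    print(summarize(near))
    print(added_latency_ms(near, far))
```

Using the median rather than the mean keeps a single slow reply from hiding or exaggerating a constant offset such as the 200 ms described here.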

Network Outline

Assuming that the problem does reside on the link between the ASA and the 2901, I cleared the interface statistics on both devices.

Since then we've had a couple of periods of increased packet loss. The interface statistics are below, but from my perspective they don't show anything out of the ordinary: no malformed or dropped packets, no interface resets, and so on. The duplex and speed settings match.

What am I missing? All of this hardware is in the building, with at least 100 Mbps connectivity.

gateway router

show interfaces GigabitEthernet 0/0
GigabitEthernet0/0 is up, line protocol is up
  Hardware is CN Gigabit Ethernet, address is a493.4ccc.b218 (bia a493.4ccc.b218)
  Internet address is xx.xx.xx.105/28
  MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
     reliability 255/255, txload 14/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full Duplex, 100Mbps, media type is RJ45
  output flow-control is unsupported, input flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 00:15:51
  Input queue: 0/75/0/6427 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 511000 bits/sec, 401 packets/sec
  5 minute output rate 5526000 bits/sec, 590 packets/sec
     413812 packets input, 83711483 bytes, 0 no buffer
     Received 5 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     600299 packets output, 695003736 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

vpn server

show interface ethernet 0/1
Interface Ethernet0/1 "", is up, line protocol is up
  Hardware is 88E6095, BW 100 Mbps
        Auto-Duplex(Full-duplex), Auto-Speed(100 Mbps)
        Available but not configured via nameif
        MAC address 001e.f76a.a441, MTU not set
        IP address unassigned
        215073 packets input, 247716476 bytes, 0 no buffer
        Received 7 broadcasts, 0 runts, 0 giants
        0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
        0 L2 decode drops
        39 switch ingress policy drops
        148763 packets output, 21509818 bytes, 0 underruns
        0 output errors, 0 collisions, 0 interface resets
        0 babbles, 0 late collisions, 0 deferred
        0 lost carrier, 0 no carrier
        0 rate limit drops
        0 switch egress policy drops
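Reading counters like the ones above by eye is error-prone, so here is a minimal Python sketch that scans pasted `show interface` text and lists every non-zero counter that is not a plain traffic total. The function name `nonzero_counters` and the parsing rules are assumptions fitted to the two outputs in this post, not a general parser for every IOS or ASA release, and the sketch says nothing about whether a flagged counter actually explains the loss.

```python
import re

# Counters that grow on any healthy link; they are traffic totals, not faults.
TRAFFIC_TOTALS = {"packets input", "packets output", "bytes", "broadcasts"}

# IOS packs three counters into one line: drops, flushes, total output drops.
QUEUE_RE = re.compile(
    r"Input queue: \d+/\d+/(\d+)/(\d+) \(size/max/drops/flushes\); "
    r"Total output drops: (\d+)"
)
# A single "<number> <name>" item, e.g. "0 CRC" or "Received 5 broadcasts (...)".
COUNTER_RE = re.compile(r"(?:Received )?(\d+) ([A-Za-z0-9 ]+?)(?: \(.*\))?")


def nonzero_counters(show_output):
    """Return {name: value} for each non-zero counter that is not a traffic total."""
    found = {}
    for line in show_output.splitlines():
        queue = QUEUE_RE.search(line)
        if queue:
            names = ("input queue drops", "input queue flushes", "total output drops")
            for name, value in zip(names, queue.groups()):
                if int(value):
                    found[name] = int(value)
            continue
        for piece in line.split(","):
            match = COUNTER_RE.fullmatch(piece.strip())
            if match and int(match.group(1)) and match.group(2) not in TRAFFIC_TOTALS:
                found[match.group(2)] = int(match.group(1))
    return found


if __name__ == "__main__":
    # A few lines excerpted from the two outputs above (router first, then ASA).
    sample = """\
  Input queue: 0/75/0/6427 (size/max/drops/flushes); Total output drops: 0
  5 minute input rate 511000 bits/sec, 401 packets/sec
     413812 packets input, 83711483 bytes, 0 no buffer
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
        39 switch ingress policy drops
"""
    print(nonzero_counters(sample))
```

Run against the outputs in this post, the only entries it reports are the 6427 input queue flushes on the router and the 39 switch ingress policy drops on the ASA.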
Tim Brigham
  • You could debug in depth, but before you do, start with the simple things. Replace the cable between the two and hard set the interfaces on both to 100/full. – TheCleaner May 22 '14 at 14:15
  • @TheCleaner - thanks. So you agree that the issue most likely lies in that specific link? Replacing that cable is definitely high on my to-do list, but it needs to wait until after hours. I'm looking for anything definitive in the interim. – Tim Brigham May 22 '14 at 14:21
  • It's hard to say for certain, but based on the info given, that's where I'd start. If that ends up not being the issue, you'd need to enable debugging on both devices at that point and start running traces during those periods of packet loss (and maybe Wireshark traces as well)... – TheCleaner May 22 '14 at 14:29
  • BTW, I have a site with the same setup (5505 to 2901), so if we need to compare some config or settings, just let me know. – TheCleaner May 22 '14 at 14:38
  • I've seen these intermittent problems caused by either bad cable or failing ports combined with link autonegotiation. It glitches and thinks it is 10Mbit half, for example. It might work fine forever just by forcing the connection type. Remember to do both ends. – Zan Lynx May 22 '14 at 16:18
  • @TheCleaner I tried the cabling, no dice. The odd thing is I'm seeing exactly 200 ms extra introduced both ways out of the 2901. Any thoughts? – Tim Brigham May 22 '14 at 21:12
  • And you changed the auto-neg on both ends to hard set? The MTU isn't set on the ASA, but that shouldn't matter since the other end is 1500. Are there policies/ACLs that are causing some dropped packets? What does an updated show interface look like now on each of them (assuming you haven't cleared them again)? – TheCleaner May 22 '14 at 21:54
  • @TheCleaner - no. I'm getting the 200 milliseconds out of two different physical interfaces, one to the fiber uplink, the other to the vpn server. I didn't see how the auto negotiation would help with a different physical interface. – Tim Brigham May 23 '14 at 13:12
  • I get the 200ms overhead, but I thought the only actual packet loss was occurring between the ASA and the 2901? Is that not the case? Disabling auto-neg will simply help rule out auto-negotiation problems as the cause. – TheCleaner May 23 '14 at 13:16
  • @TheCleaner - I do get occasional 1% packet loss from the 2901 to the fiber uplink. Seems to be at the same time as the other failures, but it's really hard to tell. My apologies for this miscommunication - it isn't reliable and it was under my initial monitoring thresholds. – Tim Brigham May 23 '14 at 13:59

1 Answer

Turned out to be a bad interface on the 5505 vpn server. We recabled and things have been rock solid ever since.

Tim Brigham