My organization has a Juniper SSG20-WLAN that routes our traffic to the outside world. We've been having intermittent problems with our internet connection so I wrote up a Python script to ping the internal interface of the router, the external interface, a couple of our internal servers, the ISP router our router talks to, their upstream provider, and Google and Yahoo for good measure. It does that about every minute.
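A minimal sketch of that kind of monitor, for illustration only. The target names and addresses are hypothetical placeholders, and the `ping` flags assume a Linux host with iputils `ping`; other platforms use different flags.

```python
import subprocess
import time
from datetime import datetime

# Hypothetical targets, ordered nearest to farthest; replace with real addresses.
TARGETS = [
    ("router-internal", "192.168.1.1"),
    ("router-external", "203.0.113.2"),
    ("isp-gateway", "203.0.113.1"),
    ("google", "www.google.com"),
]


def ping(host, timeout_s=2):
    """Return True if a single ICMP echo gets a reply (Linux iputils flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_all(targets, ping_fn=ping):
    """Ping every target once and return {name: reachable}."""
    return {name: ping_fn(host) for name, host in targets}


def format_line(results, now=None):
    """Render one timestamped log line such as '... router-external=DOWN'."""
    now = now or datetime.now()
    status = " ".join(
        f"{name}={'up' if ok else 'DOWN'}" for name, ok in results.items()
    )
    return f"{now.isoformat(timespec='seconds')} {status}"


def monitor(targets, interval_s=60, rounds=None, ping_fn=ping):
    """Log one line per round; rounds=None loops until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        print(format_line(check_all(targets, ping_fn)), flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Calling `monitor(TARGETS)` would log one line roughly every minute. Keeping the targets ordered from nearest to farthest makes it easy to see which hop is the first to stop answering.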

What I have found is that when our internet goes out, our Juniper router ceases responding to pings on the external interface. Everything past that is, of course, unreachable. The internal interface and our internal servers continue to echo back without interruption.

None of the counters indicate dropped packets of any type. They all look normal. The logs complain about VIP servers being unavailable but otherwise nothing indicative of network issues.

My questions are these:

  1. Does this exonerate our ISP? Or, contrariwise, might a problem with the connection be causing the external interface to go down?
  2. Is there somewhere else in the SSG20, besides the system log and counters, that might help me track down info on the problem?

UPDATE: It turned out that one of the switches between my monitoring box and the router was itself acting as a router, and was occasionally diverting traffic from the gateway to itself. Kudos to those who made suggestions along those lines. I'm not really sure which answer to mark as accepted, as it was really the stuff in the comments that turned out to be right.
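In hindsight, checking the first hop from the monitoring box might have surfaced this sooner. A rough sketch, assuming a Linux host with `traceroute` installed; the function names and the sample addresses are made up for illustration:

```python
import re
import subprocess

# Matches hop lines from `traceroute -n`, e.g. " 1  192.168.1.1  0.4 ms ..."
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)")


def parse_hops(traceroute_output):
    """Extract (hop_number, address) pairs from `traceroute -n` text output.

    Hops that timed out show up with '*' as the address.
    """
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2)))
    return hops


def first_hop_matches(hops, expected_gateway):
    """True only if the first hop is the gateway we expect to be using."""
    return bool(hops) and hops[0][1] == expected_gateway


def trace(host, max_hops=5):
    """Run traceroute numerically (-n) with a small hop limit (-m)."""
    completed = subprocess.run(
        ["traceroute", "-n", "-m", str(max_hops), host],
        capture_output=True,
        text=True,
    )
    return parse_hops(completed.stdout)
```

Logging `first_hop_matches(trace(target), gateway)` next to each ping result would show whether the monitoring box's traffic was actually going through the intended gateway at the moment of an outage.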

Thanks for the suggestions.

2 Answers

I work for an ISP, and what I can tell you about the routers we provide for our T1s is that when the internet connection goes down, both the WAN and LAN interfaces stop responding to pings. We don't use Junipers, but this is the case on the Cisco 1841s, Samsung Ubigate 1000s and Netopias that we use. It has to do with the way the IP is assigned and the way the routed block is delivered through the WAN IP, which makes them unreachable without a connection to our core routers.

MaQleod
  • Thanks, but we're not on T1. Our ISP provides us an ethernet link. – Alexander Garden Sep 16 '10 at 15:25
  • Ethernet? Like EoC? It's all the same in the way IPs are routed. – MaQleod Sep 16 '10 at 16:41
  • (Sorry. I thought ServerFault would notify me if you replied to my comment. Didn't know I'd have to come back to this page and check.) Their network is fiber. They have a box on our premises that converts to standard copper ethernet. Their logs from that box don't show any problems. The external interface on our Juniper router has a static IP, not dynamic. – Alexander Garden Sep 21 '10 at 14:36
  • Have you had your ISP run intrusive testing to see why the circuit keeps dropping? Are you logging errors on the WAN interface that show the alarm state when it goes down? Have you replaced all the cabling that you can concerning the bridge and your router? You need to start with some basic layer one troubleshooting at this point. – MaQleod Sep 21 '10 at 16:50
  • There's probably not much in the way of intrusive line testing since it is not a T circuit. – user48838 Sep 22 '10 at 05:57

How often do the drops occur? Any pattern that can be determined (time, traffic load, etc.)? Did this situation manifest itself after some period of things working correctly in the past? What type of media is your WAN interface (ethernet, T1 WIC, etc.)?

If it does happen to be ethernet, then you might check whether it is set to auto-negotiate. If so, you might try "hard coding" the speed and duplex settings, in case it is an auto-negotiation issue, which occurs often enough.

If it is a T1 interface, then you should start by going through the T1 stats/counters - looking for resets, FECN (forward explicit congestion notification), BECN (backward explicit congestion notification), etc. High counts with these counters may indicate issues with the carrier (need to reset LMI, LMI setting/line encoding issues, etc.).

user48838
  • I came on staff just a few weeks ago and we have been having the problem as long as I have been here. The ISP has a fiber network. Their equipment converts to standard ethernet before it hits our router. I've turned off auto-negotiate and we'll see if that helps. – Alexander Garden Sep 16 '10 at 15:21
  • If that does not clear the situation, then you need to attempt to capture any possible pattern (time, traffic load, etc. - if that has not already occurred). – user48838 Sep 17 '10 at 05:51
  • Yup. Didn't clear things up. We went offline several times over the weekend. Guess I'll just keep logging until some pattern manifests itself. Thanks for the suggestions. – Alexander Garden Sep 21 '10 at 14:33
  • Since it is an Ethernet hand-off, are the drops at the Ethernet interface? How are you detecting the drops? – user48838 Sep 22 '10 at 05:59
  • I have a Python script pinging a sequence of IPs in a loop. Internal IPs, including that of the router, respond to echo requests. The external interface of the router and everything upstream of it, do not. We were offline for ten minutes this afternoon. No pattern that I can see to the problem. – Alexander Garden Sep 22 '10 at 18:49
  • What about your ISP gateway? And possibly from the opposite direction as well? Do you have ethernet statistics for the WAN interface from the equipment? – user48838 Sep 23 '10 at 07:16
  • Is the WAN interface set to auto-negotiate? I agree with everyone else, you should look for layer 1/2 problems. – SpacemanSpiff Dec 08 '10 at 03:27