6

I came into my office this morning and the entire network was 'down', meaning that no computer could connect to any other computer on our network and no one could get out to the internet.

The network consists of a few unmanaged Netgear switches and a SonicWall firewall. I finally tracked the problem down to one specific machine: our Windows 2003 server, which runs as the primary domain controller. When it is plugged into the network, no one can ping anyone else; when it is unplugged, the entire network functions properly. I tried to disable the network adapter so I could use the machine's second network adapter, but it wouldn't disable. I restarted the server and tried the original bad adapter again, with the same results. I was finally able to disable the first network adapter and enable the second one with the same IP config. Voila, everything is up and running just fine.

My question: has anyone run into something like this, and what could cause it? What sort of things should I look for? Unfortunately, I can't troubleshoot the first adapter any further during the work day, so I have time to compile troubleshooting tips to try after hours.

senorsmile
  • If you like the answers you're getting, start accepting them as answers. It's the best way to say thank you – Nixphoe Jun 16 '11 at 17:30
  • OK. sorry about that. – senorsmile Jun 16 '11 at 20:55
  • A very similar thing happened to me with a similar config: a network of about 60 PCs and 20 servers, a SonicWall firewall, and 4 unmanaged Asus switches. One Windows 2003 member server brought the entire network down, and its event log recorded an event saying the network adapter had malfunctioned. Unplug the server and everything is up again. Strange, but I'm clearly looking at managed switches now, which would probably hold up better if it happens again. Then again, in 22 years' experience this is the first time I've come across it, so it's probably unlikely to ever happen a –  May 02 '12 at 20:06

2 Answers

8

This is the problem with basic switches. Something like a Cisco Catalyst or an HP ProCurve may have detected erratic activity on the port and shut it down. That would of course have meant no DNS if you only have one DC, and no way for users to log in, but the network would at least be effectively "up". My advice is to invest in a couple of 24-port ProCurve switches and give them a basic configuration, then team the NICs on your DCs and the like so that, should it happen again, the switch will shut down the faulty port.

As for troubleshooting the existing NIC: if it's faulty hardware, who knows what it's spewing out into your network. If you download something like Wireshark and have it running on a client PC when you plug the other NIC back in, you may be able to see what traffic it's creating or where your packets are going. To test whether it's a hardware or software issue, you could also try a Linux live CD such as Ubuntu 10.10. If you boot this and test the NIC (live CDs load into RAM and do not touch the HD, so they are safe to run for hardware diagnostics), you'll be able to see whether the problem is software (your installation of Windows) or hardware.
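Once you have a capture, a quick way to see which NIC is flooding the wire is to count frames per source MAC address. Below is a minimal sketch of that idea, not a replacement for Wireshark's own statistics views. It assumes you saved the capture in the classic little-endian pcap format (Wireshark defaults to pcapng, so you would need to pick "pcap" when saving) and that the link type is Ethernet. The function name `count_frames_by_source` is made up for this example.

```python
import struct
from collections import Counter

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1


def count_frames_by_source(pcap_bytes: bytes) -> Counter:
    """Count Ethernet frames per source MAC in a classic little-endian pcap.

    Assumption: the capture was saved as classic pcap (not pcapng).
    """
    if len(pcap_bytes) < 24:
        raise ValueError("too short to be a pcap file")
    (magic,) = struct.unpack_from("<I", pcap_bytes, 0)
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    (linktype,) = struct.unpack_from("<I", pcap_bytes, 20)
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("capture is not Ethernet")

    counts = Counter()
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_usec, captured length, original length
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            "<IIII", pcap_bytes, offset
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 12:
            # Ethernet header: 6 bytes destination, then 6 bytes source
            src = frame[6:12]
            counts[":".join(f"{b:02x}" for b in src)] += 1
    return counts
```

To use it, read the saved file in binary mode, pass the bytes to `count_frames_by_source`, and look at `.most_common(5)` on the result. If one MAC (likely the server's suspect adapter) accounts for nearly all frames, that points at the NIC or its driver as the source of the flood.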

EEAA
Alex Berry
  • 1
    Fabulous suggestions. This is actually the only Windows server here. The rest are Linux 10.04 and Debian 5 servers (going to upgrade to 6 eventually). For that matter, I can boot the server from a live Arch CD and have a simple terminal-only environment. Concerning managed switches, do the managed Netgears have the same level of network protection? – senorsmile Jun 16 '11 at 16:38
  • 1
    I don't mean to rag on anyone, but I've used Netgear in the past and won't make that mistake again. ProCurve is the cheapest of the good gear, with excellent warranties and support. – Chris S Jun 16 '11 at 17:47
  • 1
    I'd have to agree with Chris: their SMB desktop switches are pretty solid, but they're not fit for enterprise. I don't know if Netgear supports it, but the Cisco version is called errdisable. Here's an article from Cisco's website explaining the technology: http://www.cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a00806cd87b.shtml – Alex Berry Jun 16 '11 at 17:54
  • With switches you get what you pay for. We have quite a few of the NetGear switches in operation as desktop switches, and that's OK. Our network switches are all HP ProCurve or 3Com. If one of the NetGear switches fails, I don't even bother to figure out what's wrong. – wolfgangsz Jun 16 '11 at 21:39
1

If you want to troubleshoot it in a more "constrained" environment, hook it up to another machine running Wireshark or similar with a crossover cable. That way you're not risking your network.
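If the machine on the other end of the crossover cable is a Linux box (or a live CD) without Wireshark, something like the rough sketch below can give a first impression of what the suspect NIC is sending. It is a stand-in for Wireshark, not what I'd normally reach for. It assumes Linux, root privileges, and an interface name you supply yourself; `classify_frame` and `capture_summary` are made-up names for this example. Note that a raw packet socket also sees frames the capturing machine itself sends.

```python
import socket
import time
from collections import Counter

ETH_P_ALL = 0x0003  # receive every protocol


def classify_frame(frame: bytes) -> str:
    """Label a raw Ethernet frame by its destination address type."""
    if len(frame) < 6:
        return "runt"
    dst = frame[:6]
    if dst == b"\xff" * 6:
        return "broadcast"
    if dst[0] & 0x01:  # group bit set in the first octet
        return "multicast"
    return "unicast"


def capture_summary(interface: str, seconds: float = 10.0) -> Counter:
    """Count frame types seen on `interface` for `seconds` (Linux, root only)."""
    counts = Counter()
    with socket.socket(
        socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL)
    ) as sock:
        sock.bind((interface, 0))
        sock.settimeout(0.5)  # wake up regularly to check the deadline
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            try:
                frame = sock.recv(65535)
            except socket.timeout:
                continue
            counts[classify_frame(frame)] += 1
    return counts
```

For example, `capture_summary("eth0", 10)` (where `eth0` is only a placeholder for your interface) returns counts per category. Thousands of broadcast frames in a few seconds on an otherwise idle two-machine link would suggest the adapter is flooding on its own.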

Vatine