We have a Windows Storage Server (2003 R2) that serves as our company's NAS. Over the past several weeks we have begun to experience intermittent "drops" where a client briefly loses its connection to the server. This can happen when attempting to access either a mapped drive or a UNC path. When it happens, the client will typically receive one of three error messages:
Folder does not exist
Directory does not exist
The specified network name is no longer available
A second or third attempt, made within a few seconds of the drop, typically succeeds and everything is fine. This, however, plays havoc with many of our "lights-out" production processes.
When it happens, it appears to affect all clients that are attempting to access the NAS at that moment. At one point we thought we had solved it by replacing a faulty hard drive in the RAID array, but the issue continues, and we are now starting to see it on another NAS with identical hardware (and of identical age). They are both end-of-life servers that should have been replaced long ago.
Nothing of note has been found in the server event logs, RAID logs, or switch logs. The affected clients include both Linux and Windows boxes.
Any assistance or advice would be greatly appreciated. I think we are going to try some packet analysis and see if anything shows up that way. I'm not sure, though, which tool would be good for this.
Update
I tried using NetMon, but there is so much file sharing traffic that the server cannot keep up with the packet analysis.
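Since capturing on the server itself is not workable, one thing I am considering is a small probe running on a client that simply records when a drop happens, so that any later capture can be limited to a short window around those timestamps. Below is a minimal sketch of that idea in Python. The share path in the comment is made up, the function names are my own, and listing a directory is only my assumption of a reasonable stand-in for what our production jobs do.

```python
import os
import time
from datetime import datetime


def probe_once(path):
    """Try to list the directory; return None on success, else the error text."""
    try:
        os.listdir(path)
        return None
    except OSError as exc:
        return str(exc)


def run_probe(path, interval_s=1.0, count=None, log=print, sleep=time.sleep):
    """Poll `path` every `interval_s` seconds and log each failed attempt.

    `count` limits the number of attempts (None means run until interrupted).
    Returns a list of (timestamp, error) tuples for the failed attempts.
    """
    failures = []
    attempts = 0
    while count is None or attempts < count:
        err = probe_once(path)
        if err is not None:
            stamp = datetime.now().isoformat(timespec="milliseconds")
            failures.append((stamp, err))
            log(f"{stamp} FAIL {path}: {err}")
        attempts += 1
        # Skip the final sleep when a fixed number of attempts was requested.
        if count is None or attempts < count:
            sleep(interval_s)
    return failures


# Example call (hypothetical share name, not our real one):
# run_probe(r"\\nas01\share", interval_s=1.0)
```

Running this from one Windows client and one Linux client at the same time should at least show whether both see the drop at the same instant, which would point at the server or the network rather than at an individual client.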