
We have a file server running Windows Server 2012 R2 on a Dell PowerEdge R720, and in recent days we have been seeing a very strange network/disk performance issue. The box has a RAID-1 disk group for the OS (disk0) behind a Dell H730P controller, and SAN storage for data (disk1).

Symptom 1:

Users complain that they cannot access files as usual.
Network response is very slow with high latency, even when we ping localhost.
The NICs are teamed (NIC0 and NIC1).
There are about 300 shared-folder clients and 125 IPC$ sessions.
There are about 400 open files.

Symptom 2:

Drive C: (disk0 on RAID-1) sometimes shows an abnormal disk queue length, greater than 1,
and at times up to 2 or 3.

The high latency coincides with the abnormal disk queue length.

But drive C: only holds OS files, the pagefile, and programs, and it has 80% free space;
all of the business data is kept on drive D:.

Symptom 3:

If we reboot the box, all the issues go away.
But the problem returns after about one or two weeks of uptime.

We need your help and guidance to diagnose this and find the root cause.

Thanks.

  • "But drive C: only holds OS files, the pagefile, and programs" - so analyze that. Hint: this smells like the pagefile. – TomTom Oct 30 '21 at 01:05

1 Answer

In your position, I would start collecting performance measurements with the Windows Performance Monitor (perfmon) tool.

You will be able to see which processes are using your disks and/or your network resources.

You can start a capture right after a reboot and let it run for a long time. I always record performance data over several weeks.
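As a rough illustration of such a long capture, here is a small Python sketch that only builds (it does not run) a command line for `typeperf`, the counter logger that ships with Windows. The counter selection, the 15-second interval, and the output path `C:\PerfLogs\fileserver.csv` are my own assumptions for this case, not something taken from your setup; adjust them to what you want to watch.

```python
# Sketch only: assembles a typeperf command line; it does not execute it.
# The counter list, interval and output path are assumptions for illustration.

COUNTERS = [
    r"\PhysicalDisk(*)\Avg. Disk Queue Length",
    r"\PhysicalDisk(*)\Avg. Disk sec/Transfer",
    r"\Memory\Pages/sec",
    r"\Paging File(_Total)\% Usage",
    r"\Network Interface(*)\Bytes Total/sec",
    r"\Process(*)\IO Data Bytes/sec",
]


def build_typeperf_command(output_csv, interval_s=15, duration_h=24):
    """Return the argument list for a typeperf capture of COUNTERS.

    -si is the sample interval in seconds, -sc the number of samples,
    -f the log format, -o the output file, -y answers prompts with yes.
    """
    samples = int(duration_h * 3600 // interval_s)
    return [
        "typeperf", *COUNTERS,
        "-si", str(interval_s),
        "-sc", str(samples),
        "-f", "CSV",
        "-o", output_csv,
        "-y",
    ]


if __name__ == "__main__":
    cmd = build_typeperf_command(r"C:\PerfLogs\fileserver.csv")  # hypothetical path
    # Quote arguments containing spaces so the line can be pasted into cmd.exe.
    print(" ".join(f'"{a}"' if " " in a else a for a in cmd))
```

The paging counters are in there because the C: drive holds the pagefile; if the queue length on C: rises together with `Pages/sec`, that points at memory pressure rather than at the disk itself.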

EDIT:

If you can, also start perfmon during the high-latency events. If you start it beforehand and the responsible process has not started yet, that process will not be logged in the counters.

Other questions:

  • Are you sure about your RAID on disk 0?
  • Have you tried with teaming disabled, using only one interface?
  • Do you have security audit policies enabled on your shared files that write to your event logs?
  • Do you have any FSRM rules enabled on your shared files?
  • Are any logs being generated by some software?
  • Can you correlate (with perfmon) your network latency with the queue length?
  • ...

Good luck.
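For the correlation question, one way to check it offline is to export the counter log to CSV and compute a Pearson coefficient between two columns. The sketch below is only that, a sketch: the host name `FS01`, the instance names, and the sample values are made up for illustration, and I use the `Output Queue Length` counter of the network interface as a stand-in for "network latency", which is an assumption on my part.

```python
import csv
import io
import math

# Synthetic sample in perfmon CSV layout (hypothetical host and values).
SAMPLE = r'''
"(PDH-CSV 4.0) (UTC)(0)","\\FS01\PhysicalDisk(0 C:)\Avg. Disk Queue Length","\\FS01\Network Interface(NIC Team)\Output Queue Length"
"01/30/2017 10:00:00.000","0.2","0"
"01/30/2017 10:00:15.000","0.4","1"
"01/30/2017 10:00:30.000"," ","2"
"01/30/2017 10:00:45.000","2.8","9"
"01/30/2017 10:01:00.000","3.1","11"
'''


def read_counter_columns(csv_file, name_a, name_b):
    """Return paired samples from the columns whose headers contain the names."""
    reader = csv.reader(csv_file)
    header = next(reader)
    idx_a = next(i for i, h in enumerate(header) if name_a in h)
    idx_b = next(i for i, h in enumerate(header) if name_b in h)
    xs, ys = [], []
    for row in reader:
        try:
            x, y = float(row[idx_a]), float(row[idx_b])
        except (ValueError, IndexError):
            continue  # skip rows where either sample is blank or missing
        xs.append(x)
        ys.append(y)
    return xs, ys


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length with at least 2 samples")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    xs, ys = read_counter_columns(
        io.StringIO(SAMPLE.strip()),
        "Avg. Disk Queue Length",
        "Output Queue Length",
    )
    print(f"samples used: {len(xs)}, r = {pearson(xs, ys):.3f}")
```

A coefficient close to 1 on your real log would only tell you the two move together, not which one causes the other, so it is worth adding the paging and per-process I/O counters to the same comparison.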
Sorcha
  • Thanks Sorcha. We do run perfmon to collect data for troubleshooting, and as you can see we gave some key figures in the post, but none of the processes shows any abnormal network behavior. What confuses us most is that the high network latency accompanies the high disk queue length on drive C:. Is there a way to figure out what is happening? – Steven Wang Feb 02 '17 at 14:53