This answer only applies to 7-mode - I have no experience with cluster mode.
With performance problems, there are simply no easy answers.
You have counters for IOPS, which you can show with `sysstat -x`. `stats show system` will give you something similar - a list of NFS/FCP/CIFS ops etc.
On their own though, these numbers are fairly arbitrary - how do you know how many IOPS is 'too many'?
The indicator I find most useful is the consistency point (CP) type. Again, back to `sysstat -x`. The way filers do write IO is to fill an NVRAM cache. This cache is flushed periodically, and data is written to disk in bursts.
What type of consistency point occurred is a good indicator of whether your system is 'happy'.
https://kb.netapp.com/support/index?page=content&id=3014024
T means your system is idle. (Triggered by timer - not much happened for 10s, so it thought it had better destage anyway.)
S or Z is a 'forced' cp because of a snapshot/snapmirror op. (and usually isn't a problem)
F or H or L means your system is getting busy. (F is nvram filling with write data, H/L represent high and low watermarks for memory)
B or b means your system is struggling. (Back-to-back CPs, which means you're hitting the limits of your ability to write to disk.)
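If you are logging `sysstat -x` output over time, the list above can be turned into a quick tally. This is only a rough sketch: the function names and labels are my own, and it assumes you have already pulled the CP type field out of each line, with the type letter as the first character of that field.

```python
from collections import Counter

# Hypothetical mapping from the CP type letter to the rough health
# reading given in the list above. The labels are illustrative only.
CP_HEALTH = {
    "T": "idle",
    "S": "forced",
    "Z": "forced",
    "F": "busy",
    "H": "busy",
    "L": "busy",
    "B": "struggling",
    "b": "struggling",
}


def classify_cp(cp_type):
    """Return a health label for one CP type field.

    Assumption: the type letter is the first character of the field;
    anything not in the list above is reported as 'unknown'.
    """
    code = cp_type.strip()[:1]
    return CP_HEALTH.get(code, "unknown")


def summarise(cp_types):
    """Count how many samples fall under each health label."""
    return dict(Counter(classify_cp(t) for t in cp_types))
```

A sample set dominated by 'struggling' is the one to worry about; a scattering of 'forced' entries around snapshot schedules is normal.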
This is almost entirely about write IO though. Another reason your system can be struggling is read IO. Writes can easily be cached; reads must be fetched immediately - and only in some cases can they be cached.
Your `stats show` counters will give you `disk_data_read` and `disk_data_written`. `sysstat -x` will give you the same, and a notion of disk utilisation. (But be warned - that utilisation is 'cross system', so it won't show you if you have one really hot aggregate averaged with a 'cold' one.)
You can also run `stats show volume` to get per-volume IO stats. This will give you an idea of the total reads/writes, and which volume they're going to. It also distinguishes between 'read', 'write' and 'other'. 'other' can be quite significant, and problematic.
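Once you have the per-volume numbers, ranking them makes the busy volumes and the 'other'-heavy ones stand out. A minimal sketch, assuming you have already collected the read/write/other op counts into a dictionary yourself; the volume names and figures below are made up for illustration and the parsing of the actual command output is not shown.

```python
def rank_volumes(stats):
    """Sort volumes by total ops, busiest first.

    Each result row is (name, total_ops, share_of_other_ops).
    """
    rows = []
    for name, ops in stats.items():
        total = ops["read"] + ops["write"] + ops["other"]
        # Avoid dividing by zero for a completely idle volume.
        other_share = ops["other"] / total if total else 0.0
        rows.append((name, total, other_share))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative, made-up numbers: not real filer output.
sample = {
    "vol_home": {"read": 1200, "write": 300, "other": 500},
    "vol_db": {"read": 400, "write": 900, "other": 100},
    "vol_idle": {"read": 0, "write": 0, "other": 0},
}

for name, total, other_share in rank_volumes(sample):
    print(f"{name}: {total} ops, {other_share:.0%} other")
```

A volume where 'other' makes up a large share of the ops is worth a closer look, even if its read and write numbers seem modest.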