Show IO on Netapp

Question

I think I might be hitting the IO limits of what my Netapp can deliver, as I have been adding more servers to my cluster and iowait has gone up on each server.

However, how do I quantify this? How can I use Netapp CLI tools to view current IO stats? I am aware of "stats show" but not seeing an "io" object or similar. How do I know what the Netapp is supposed to be able to deliver?

If anyone has more experience with Netapp than I, I would greatly appreciate the help.

Thanks!

It would be helpful if you could tell us whether you're using 7-mode or CDOT. — Basil, Jan 16 '15 at 20:45
Netapp has 2 operating systems- clustered data ontap and 7-mode data ontap. They're fundamentally different, and the answer for one will not be the answer for the other. — Basil, Jan 16 '15 at 21:14

score 1 · Answer 1 · edited Aug 12 '15 at 15:30


Check out the My AutoSupport part of the netapp support site. It has performance data you can analyze, as well as some health checks.

edited Aug 12 '15 at 15:30

brandeded


answered Jan 16 '15 at 21:40

Basil


score 1 · Answer 2 · answered Jan 18 '15 at 00:30

There are several options for monitoring the performance of a NetApp filer, depending on the version of Data ONTAP. Run sysconfig to see the version. For clustered ONTAP, you can use OnCommand Performance Manager as a GUI tool; another option is to use QoS as a performance monitor. For 7-mode, you can use the sysstat or statit console commands.

score 1 · Answer 3 · edited Jun 11 '20 at 10:02

This answer only applies to 7-mode - I have no experience with cluster mode.

With performance problems, there are simply no easy answers.

You have counters for iops, that you can show with sysstat -x.

stats show system will give you something similar - a list of NFS/FCP/CIFS ops etc.

On their own though, these things are fairly arbitrary - how do you know how many IOPs is 'too many'?

The most useful indicator I've found is the type of consistency point (CP). Again, this comes from sysstat -x. Filers handle write IO by filling an NVRAM cache. This cache is flushed periodically, and data is written to disk in bursts.

What type of consistency point occurred is a good indicator of whether your system is 'happy'. https://kb.netapp.com/support/index?page=content&id=3014024

T means your system is idle. (Triggered by a timer - not much happened for 10s, so it decided to destage anyway.)
S or Z is a 'forced' CP because of a snapshot/snapmirror op (and usually isn't a problem).
F or H or L means your system is getting busy. (F is NVRAM filling with write data; H/L represent high and low watermarks for memory.)
B or b means your system is struggling. (Back-to-back CPs, which means you're hitting the limits of your ability to write to disk.)
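
If you capture the CP type column over a period of time, the list above can be turned into a rough health check. The following is a minimal Python sketch, not a NetApp tool: it assumes you have already extracted the CP type values from sysstat -x output yourself, and that only the first character of each value matters (an assumption; the column may carry extra characters). The function names are made up for illustration.

```python
# Meaning of the first character of a CP type value, taken from the list
# in this answer. How the column is laid out in your output is an assumption.
CP_MEANING = {
    "T": "idle (timer-triggered CP)",
    "S": "forced CP for snapshot/snapmirror, usually fine",
    "Z": "forced CP for snapshot/snapmirror, usually fine",
    "F": "getting busy (NVRAM filling with write data)",
    "H": "getting busy (high memory watermark)",
    "L": "getting busy (low memory watermark)",
    "B": "struggling (back-to-back CPs)",
    "b": "struggling (back-to-back CPs)",
}


def classify_cp(cp_type):
    """Return a rough health label for one CP type value."""
    code = cp_type.strip()[:1]
    return CP_MEANING.get(code, "unknown or no CP in this interval")


def struggling_fraction(cp_types):
    """Fraction of samples that show back-to-back CPs (B or b)."""
    samples = list(cp_types)
    if not samples:
        return 0.0
    hits = sum(1 for c in samples if c.strip()[:1] in ("B", "b"))
    return hits / len(samples)
```

A high fraction of B/b samples over a busy period would support the idea that the filer is write-bound; a column that is mostly T would point away from it.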

This is almost entirely about write IO though. Another reason your system can be struggling is read IO. Writes can easily be cached; reads must be fetched immediately - and only in some cases can they be cached.

Your stats show counter will give you disk_data_read and disk_data_written. sysstat -x will give you the same, and a notion of disk utilisation. (But be warned - that utilisation is 'cross system' so won't show you if you have one really hot aggregate averaged with a 'cold' one).

You can also run stats show volume to get per-volume IO stats. This will give you an idea of the total reads/writes, and which volume they're going to. It also distinguishes between 'read', 'write' and 'other'. 'other' can be quite significant, and problematic.
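
To see which volumes carry the load and how large the 'other' share is, the per-volume counters can be summarised with a small script. This is only a sketch under assumptions: it presumes the saved output consists of lines shaped like object:instance:counter:value (for example volume:vol0:read_ops:120/s) and that the counters are named read_ops, write_ops and other_ops. Check both against what your system actually prints; the function names are hypothetical.

```python
from collections import defaultdict


def parse_volume_ops(text):
    """Collect read/write/other ops per volume from saved stats output.

    Assumes lines of the form object:instance:counter:value.
    Lines that do not match this shape are skipped.
    """
    wanted = {"read_ops", "write_ops", "other_ops"}
    ops = defaultdict(dict)
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 4 or parts[0] != "volume":
            continue
        _, volume, counter, value = parts
        if counter not in wanted:
            continue
        number = value.split("/")[0]  # drop a trailing '/s' unit if present
        try:
            ops[volume][counter] = float(number)
        except ValueError:
            continue  # value was not numeric, ignore the line
    return dict(ops)


def other_share(counters):
    """Fraction of a volume's ops that are 'other' (0.0 if no ops)."""
    total = sum(counters.values())
    return counters.get("other_ops", 0.0) / total if total else 0.0
```

Sorting the volumes by total ops, or by other_share, gives a quick shortlist of where to look first.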

score 0 · Answer 4 · answered Aug 25 '16 at 13:31


Netapp also provides a tool called perfstat, which collects data for troubleshooting performance and I/O issues:

https://kb.netapp.com/support/index?page=content&id=1013882

answered Aug 25 '16 at 13:31

bgtvfr


score 0 · Answer 5 · answered Aug 25 '16 at 18:18

Well, I guess you ran iostat, saw "iowait" on the server side, and concluded that the Netapp may be too slow. If you now look at the Netapp, you will find everything and nothing to prove your theory, I promise you.
That is not because the Netapp storage provides too little information. But if you do not know what you are looking for, you will not get to the root of the problem (if there is a performance issue related to the storage at all).
Therefore I would suggest another approach: look from server to storage and follow the I/O flow.
First of all, how are the servers connected? Fibre Channel SAN? NFS/iSCSI (IP-based)?
At what times do you see "iowait"? Do you see "iowait" with little or no I/O activity and with low LUN utilization? Could it be related to a running backup?
What servers are connected? Mostly VMware?
What are the I/O characteristics (read/write ratio)?
Could there be a problem with unaligned I/O?
How is the I/O queue configured on the server side?
You should analyse from server to storage, not vice versa. Start with a clear picture of your configuration and storage topology. This would also help us give you more ideas for checking whether there is a (storage) issue and where it is located.
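
As a starting point on the server side, the "iowait" figure can be logged over time so it can be lined up with backups or other scheduled jobs. The sketch below assumes the servers run Linux (the question does not say) and uses the aggregate cpu line of /proc/stat, whose fifth value is iowait. The function names are made up for illustration.

```python
def cpu_times(stat_text):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    Field order on the cpu line: user nice system idle iowait irq
    softirq steal (guest times are already counted in user/nice,
    so only the first eight fields are summed).
    """
    b, a = cpu_times(before), cpu_times(after)
    deltas = [y - x for x, y in zip(b, a)]
    if len(deltas) < 5:
        raise ValueError("cpu line has no iowait field")
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total > 0 else 0.0
```

To use it, read /proc/stat twice a few seconds apart and pass both snapshots to iowait_percent; logging the result with a timestamp makes it easier to answer the "at what times" question above.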

score 0 · Answer 6 · answered Aug 25 '16 at 18:36


The Performance Advisor tool that comes with OnCommand Unified Manager is what you'd want. This software is free to all NetApp customers. It will monitor IOPS information at the controller, aggregate, volume and LUN level.

answered Aug 25 '16 at 18:36

Matt L.

