-1

I am monitoring a new database server with an SSD RAID 10 array. It works fine for now, but I worry that it could someday reach its I/O limit as the business grows. How can I know that before it's too late?

I think there are some parameters I can monitor, for example wa from top and tps from iostat, but I don't know how much the SSDs can really handle. I want to be warned before a real disaster happens. What should I do?

EDIT: Sorry for the misunderstanding. I am asking about I/O load, not drive lifetime.

yaoxing
  • The typical approach is to run a benchmarking tool that gives you the peak IO capabilities, and then monitor whether your real usage patterns start approaching those limits. If you're worried about lifetime, use the appropriate monitoring for your RAID controller. – HBruijn Mar 12 '15 at 09:56
  • @HBruijn But it's already in production, so I can't do that now. Is there anything I can use as a reference? Some rule-of-thumb values, maybe? – yaoxing Mar 12 '15 at 10:07
  • Set up performance monitoring software that gathers and plots speed statistics. – Skaperen Mar 12 '15 at 10:07
  • Do you know the ratings of the hardware? If not, you will need to take it offline and run a benchmark. – Skaperen Mar 12 '15 at 10:09
  • @Skaperen Yes, I did. But how do I know the limit? For example, tps keeps growing over time, then one day it reaches the limit and stops growing. I need to know that before the server can no longer handle my business. – yaoxing Mar 12 '15 at 10:10
  • You need to find out your tps-to-I/O ratio (compare the stats), then compare your current I/O usage to the ratings. Alternatively, watch the ups and downs of your usage stats and be worried when you start to see the peaks level off. – Skaperen Mar 12 '15 at 10:14
  • When you hit the ceiling in IO performance, most applications do not fail catastrophically; they simply slow down a bit. If you didn't measure that ceiling beforehand, you'll know it when you hit it. In your monitoring graphs, it shows up as peaks that start to level out, similar to what is seen in [this random graph](http://blog.dastrup.com/wp-content/uploads/2008/03/zenoss-wmi-disk-busy.gif). – HBruijn Mar 12 '15 at 10:22
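
The two ideas from the comments (compare real usage with a benchmarked ceiling, and watch for peaks that level off) could be sketched roughly as below. This is only a minimal illustration, not something posted in the thread: the function names, the 80% warning threshold, the 7-sample window and the 2% tolerance are all assumptions, and the IOPS samples are assumed to come from whatever monitoring you already collect (for example the tps column of iostat).

```python
from statistics import mean


def headroom_report(iops_samples, benchmarked_peak_iops, warn_fraction=0.8):
    """Compare the observed peak with a benchmarked ceiling.

    benchmarked_peak_iops is assumed to come from a benchmark run on
    identical hardware; warn_fraction is an arbitrary example threshold.
    """
    if not iops_samples:
        raise ValueError("need at least one IOPS sample")
    if benchmarked_peak_iops <= 0:
        raise ValueError("benchmarked peak must be positive")
    observed_peak = max(iops_samples)
    utilization = observed_peak / benchmarked_peak_iops
    return {
        "observed_peak": observed_peak,
        "utilization": utilization,
        "warn": utilization >= warn_fraction,
    }


def peaks_levelling_off(daily_peaks, window=7, tolerance=0.02):
    """Heuristic plateau check for when no benchmark figure is available.

    Returns True when the last `window` peaks sit within `tolerance`
    (relative spread) of each other while being higher on average than
    the `window` peaks before them.
    """
    if len(daily_peaks) < 2 * window:
        return False
    recent = daily_peaks[-window:]
    earlier = daily_peaks[-2 * window:-window]
    recent_mean = mean(recent)
    if recent_mean == 0:
        return False
    spread = (max(recent) - min(recent)) / recent_mean
    return spread <= tolerance and recent_mean > mean(earlier)


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(headroom_report([1000, 4000, 8500], benchmarked_peak_iops=10000))
    print(peaks_levelling_off(
        [100, 200, 300, 400, 500, 600, 700,
         1000, 1005, 998, 1002, 1001, 999, 1003]))
```

A flat stretch of peaks can also just mean that demand stopped growing, so a plateau is a hint to investigate rather than proof that the array is saturated.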

2 Answers

2

Assuming this is a hardware RAID 1+0 and there's a controller managing the devices, just let the drive fail.

Imagine this were a spinning hard disk. Do you care why or how it failed... or just that it failed?

RAID controllers use a variety of parameters to determine drive and array health. S.M.A.R.T. statistics are just one element of this... But an SSD in an array will fail and be marked offline when its time comes.

ewwhite
  • Sorry, I didn't explain it clearly enough. I'm not asking about lifetime. I'm asking about load: how much IO can an SSD really handle? – yaoxing Mar 12 '15 at 10:02
  • Without hardware, OS and configuration details, how could we answer this? – ewwhite Mar 12 '15 at 10:09
  • I was actually asking for a way to calculate this myself, so that I don't need to ask next time. I mean things like which parameters I should care about, how to convert them, and so on. Is the question too broad? How can I narrow it down? – yaoxing Mar 12 '15 at 10:23
1

Write capacity is one of the biggest myths surrounding SSDs, and was only really an issue with the very early drives. Most SSDs will last for decades before reaching their write limit, well beyond the useful life of any drive. See this article for more info.

That said, if you really want to check how much data has been written to an SSD, you should be able to extract this from the drive's SMART data.
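
As a rough illustration of what to do with that SMART data, here is a small Python sketch that converts a raw "LBAs written" counter into bytes and projects how long the drive would take to reach a rated endurance figure. Everything here is an assumption rather than part of this answer: the counter is typically exposed as an attribute such as Total_LBAs_Written, but its name and unit vary by vendor (512-byte units are assumed below), the rated TBW value has to come from the drive's datasheet, and drives behind a hardware RAID controller may need controller-specific options before a tool like smartctl can read them at all.

```python
def bytes_written(total_lbas_written, lba_size_bytes=512):
    """Convert a raw LBA counter to bytes.

    lba_size_bytes=512 is an assumption; some vendors count in
    different units, so check the drive documentation.
    """
    return total_lbas_written * lba_size_bytes


def years_until_rated_endurance(written_bytes, rated_tbw, power_on_hours):
    """Project years left until the rated TBW, at the average write rate so far.

    rated_tbw is the vendor's endurance rating in terabytes (10**12 bytes);
    power_on_hours is the drive's SMART power-on-hours counter.
    """
    if written_bytes <= 0 or power_on_hours <= 0:
        raise ValueError("need positive written_bytes and power_on_hours")
    remaining = rated_tbw * 10**12 - written_bytes
    if remaining <= 0:
        return 0.0
    bytes_per_hour = written_bytes / power_on_hours
    return remaining / bytes_per_hour / (24 * 365)


if __name__ == "__main__":
    # Made-up example: 1 TB written during one year of power-on time,
    # on a drive with a hypothetical 100 TBW rating.
    written = bytes_written(1_953_125_000)
    print(written)
    print(years_until_rated_endurance(written, rated_tbw=100, power_on_hours=8760))
```

The projection assumes the write rate stays constant, which is unlikely on a growing database, so treat the result as an order-of-magnitude estimate.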

Craig Watson