I want to monitor the response time of an API. I can use metrics such as the average, the median, and others for monitoring, but I am facing the following problems with them:
Problem with average
If one of the requests takes a very long time, it distorts the result. For example, in the set below the average becomes high because of the single value 1000.
S1= [ 1 , 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1000]
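To make the effect concrete, here is a minimal Python sketch that compares the average and the median of S1. It uses only the standard `statistics` module; the variable names are my own.

```python
from statistics import mean, median

# Sample from the question: eleven fast requests and one very slow one.
S1 = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1000]

avg = mean(S1)    # dominated by the single 1000 outlier
med = median(S1)  # unaffected by the outlier

print(f"average = {avg:.2f}, median = {med}")
```

The average lands near 85 even though eleven of the twelve requests took 2 or less, while the median stays at 2.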
Problem with median
It only reflects the faster 50% of requests. For example, in the set S2 = [2, 2, 2, 2, 2, 50, 50, 50, 50] the median is 2, even though nearly half of the users are getting slow responses.
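A small sketch of the S2 case, assuming a hypothetical "slow" threshold of 10 (in the same units as the samples; the threshold is my own choice for illustration, not something from the article):

```python
from statistics import median

S2 = [2, 2, 2, 2, 2, 50, 50, 50, 50]
SLOW_THRESHOLD = 10  # hypothetical cut-off for a "slow" response

med = median(S2)
# Fraction of requests that exceeded the threshold.
slow_fraction = sum(1 for t in S2 if t > SLOW_THRESHOLD) / len(S2)

print(f"median = {med}, slow requests = {slow_fraction:.0%}")
```

The median reports 2, while 4 of the 9 requests are above the threshold.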
Problem with 5-95 span (http://steveakers.com/2013/08/01/span-vs-median-for-response-time-monitors/)
In the article above, the author suggests using the value upper95 - upper5. But that will not generate an alert if the response times look like S3 = [50, 50, 50, 50, 50]. In this case all API responses are slow, but the 5-95 span is zero.
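The S3 case can be checked with a short sketch. I am assuming here that upper95 and upper5 mean the 95th and 5th percentiles, and I compute them with a simple nearest-rank helper (`percentile` is my own function, not a library call):

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

S3 = [50, 50, 50, 50, 50]

upper95 = percentile(S3, 95)
upper5 = percentile(S3, 5)
span = upper95 - upper5  # the 5-95 span from the article

print(f"upper95 = {upper95}, upper5 = {upper5}, span = {span}")
```

Every request is slow, yet the span is 0 because it only measures spread, not level.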
I am thinking of using one of these two values: upper95 or (upper95 + upper5) / 2.
Which one is better, and why? Is there a better method for measuring QoS?
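For comparison, this is how the two candidates behave on the three sets above. It is a rough sketch that again assumes upper95 and upper5 are nearest-rank 95th and 5th percentiles; `percentile` and `candidates` are hypothetical helper names:

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def candidates(values):
    """Return (upper95, midpoint), where midpoint = (upper95 + upper5) / 2."""
    upper95 = percentile(values, 95)
    upper5 = percentile(values, 5)
    return upper95, (upper95 + upper5) / 2

samples = {
    "S1": [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1000],
    "S2": [2, 2, 2, 2, 2, 50, 50, 50, 50],
    "S3": [50, 50, 50, 50, 50],
}

results = {name: candidates(values) for name, values in samples.items()}
for name, (upper95, midpoint) in results.items():
    print(f"{name}: upper95 = {upper95}, midpoint = {midpoint}")
```

With samples this small, upper95 on S1 is just the 1000 outlier, so it is as sensitive as the maximum. On S2 the midpoint (26.0) is pulled down by the fast requests, while upper95 stays at 50.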