I have access to my service's latency metrics at all percentiles. I need to calculate the 10% trimmed mean of the service's latency. Is there a way to approximate the 10% trimmed mean using just the percentile data? I understand I can write a script that computes the mean over the transactions between the 10th and 90th percentiles, but since this number will be used directionally only, I was wondering if there is an easy hack to estimate it, as computing it exactly at scale would be expensive.
1 Answer
This is really more suitable for stats.stackexchange.com, but anyway, you can approximate the trimmed mean, or any other sample statistic, given the percentiles. From the percentiles, construct the equivalent histogram. Each bar spans from one percentile value to the next, and its height equals the difference between the two percentile levels, that is, the fraction of samples falling in that interval. (So if you reversed the process and added up the bars, you would get the percentiles again.)
Now, with that histogram, calculate the sample statistic. The exact value is an integral. An easy approximation is to generate a number of data points from the span of each bar, and then use those points to calculate the sample statistic according to the ordinary formula. The first thing to try is to generate data points equal to the midpoint of each bar, with the number of points in each bar proportional to the bar height.
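To make the integral concrete, here is a sketch under stated assumptions. The symbols are introduced here for illustration: $Q(p)$ is the latency at percentile level $p$, $\alpha = 0.1$ is the fraction trimmed from each tail, and $(p_i, q_i)$ are the known percentile levels and values. Assuming latency is spread uniformly within each bar (something the percentile data alone cannot confirm), the midpoint approach amounts to the trapezoid rule over the bars that lie between the 10th and 90th percentiles:

```latex
\bar{x}_{\alpha}
  = \frac{1}{1 - 2\alpha} \int_{\alpha}^{1-\alpha} Q(p)\, dp
  \approx \frac{1}{1 - 2\alpha}
    \sum_{i \,:\, \alpha \le p_i < p_{i+1} \le 1-\alpha}
    (p_{i+1} - p_i)\, \frac{q_i + q_{i+1}}{2}
```

The more percentile levels you have between the trim points, the closer the sum gets to the integral.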
I don't know a package to do this, but with this description maybe you can look it up, or work out the details.
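The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a vetted implementation: the function name `trimmed_mean_from_percentiles` and its parameters are hypothetical, percentile levels are assumed to be given as fractions in ascending order, and latency is assumed to be spread uniformly within each bar so that a bar straddling a trim point can be clipped by linear interpolation.

```python
def trimmed_mean_from_percentiles(levels, values, trim=0.10):
    """Approximate a trimmed mean from percentile data (illustrative sketch).

    levels: percentile levels as fractions in [0, 1], strictly ascending
    values: latency observed at each level, in the same order
    trim:   fraction to cut from each tail (0.10 for a 10% trimmed mean)
    """
    lo, hi = trim, 1.0 - trim
    total = 0.0   # sum of (bar height * bar midpoint)
    weight = 0.0  # total bar height that falls inside [lo, hi]

    # Walk over consecutive percentile pairs; each pair is one histogram bar.
    for p0, p1, v0, v1 in zip(levels, levels[1:], values, values[1:]):
        # Clip the bar to the untrimmed range of percentile levels.
        a, b = max(p0, lo), min(p1, hi)
        if b <= a:
            continue  # bar lies entirely in a trimmed tail

        # Latency at the clipped edges, assuming a uniform spread in the bar.
        va = v0 + (v1 - v0) * (a - p0) / (p1 - p0)
        vb = v0 + (v1 - v0) * (b - p0) / (p1 - p0)

        # Bar height times bar midpoint.
        total += (b - a) * (va + vb) / 2.0
        weight += b - a

    if weight == 0.0:
        raise ValueError("no percentile data inside the untrimmed range")

    # Normalize by the height actually covered, in case the supplied
    # percentiles do not span the whole [lo, hi] range.
    return total / weight


# Example with made-up numbers: latency in ms at p10, p25, p50, p75, p90.
levels = [0.10, 0.25, 0.50, 0.75, 0.90]
values = [12.0, 18.0, 30.0, 55.0, 120.0]
print(trimmed_mean_from_percentiles(levels, values))
```

Dividing by the covered height rather than by a fixed 0.8 is a design choice: it keeps the result a proper weighted average even if the 10th or 90th percentile itself is missing from the input.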
