In statistics, a percentile (or centile) is the value of a variable below which a certain percent of observations fall.
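As a rough illustration of that definition, the sketch below uses the nearest-rank convention, which is only one of several conventions in use. The function names `percent_below` and `percentile_nearest_rank` are hypothetical and chosen for this example.

```python
import math


def percent_below(values, x):
    """Return the percentage of observations strictly below x."""
    if not values:
        raise ValueError("values must be non-empty")
    return 100.0 * sum(v < x for v in values) / len(values)


def percentile_nearest_rank(values, p):
    """Return the smallest observation with at least p percent of the data at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # Nearest-rank: the 1-based rank is ceil(p/100 * n).
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


scores = [15, 20, 35, 40, 50]
print(percentile_nearest_rank(scores, 50))  # 35
print(percent_below(scores, 35))            # 40.0
```

Other conventions interpolate between neighbouring observations, which is why different tools can report slightly different values for the same data.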
Questions tagged [percentile]
739 questions
20
votes
4 answers
Understanding histogram_quantile based on rate in Prometheus
According to the Prometheus documentation, to get the 95th percentile from a histogram metric I can use the following query:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
Source:…

evgeniy44
- 2,862
- 7
- 28
- 51
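For readers unfamiliar with how a quantile can be estimated from cumulative histogram buckets, here is a simplified sketch of linear interpolation inside the bucket that contains the target rank. It is an illustration under stated assumptions, not Prometheus's actual implementation: the function name, the `(upper_bound, cumulative_count)` input format, and the sample numbers are all made up for this example.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, with the last pair being the +Inf bucket (assumed format).
    """
    if not buckets or not 0 <= q <= 1:
        raise ValueError("need at least one bucket and 0 <= q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # target position among all observations
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: fall back to the
                # highest finite upper bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative counts per "le" bucket.
sample = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, sample))  # 0.75
```

In the query above the per-bucket values would be per-second rates rather than raw counts, but the interpolation only depends on their relative sizes.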
20
votes
10 answers
Fast Algorithm for computing percentiles to remove outliers
I have a program that needs to repeatedly compute the approximate percentile (order statistic) of a dataset in order to remove outliers before further processing. I'm currently doing so by sorting the array of values and picking the appropriate…

Eamon Nerbonne
- 47,023
- 20
- 101
- 166
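One common alternative to sorting the whole array is a selection algorithm such as quickselect, which finds the k-th smallest value in expected linear time. The sketch below is in Python for illustration only (the question's own language is not shown in the excerpt), and `quickselect`, `percentile`, and `trim_outliers` are hypothetical names.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) without a full sort."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        lows = [v for v in items if v < pivot]
        highs = [v for v in items if v > pivot]
        n_equal = len(items) - len(lows) - len(highs)
        if k < len(lows):
            items = lows                  # answer is among the smaller values
        elif k < len(lows) + n_equal:
            return pivot                  # answer equals the pivot
        else:
            k -= len(lows) + n_equal      # answer is among the larger values
            items = highs


def percentile(values, p):
    """Approximate p-th percentile: the element at the rounded rank."""
    k = round(p * (len(values) - 1) / 100)
    return quickselect(values, k)


def trim_outliers(values, low=5, high=95):
    """Keep only values between the low and high percentiles (inclusive)."""
    lo, hi = percentile(values, low), percentile(values, high)
    return [v for v in values if lo <= v <= hi]
```

For example, `trim_outliers(data, 5, 95)` makes two selection passes instead of one full sort; whether that is faster in practice depends on the data size and the language's sort implementation.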
17
votes
9 answers
Calculate Percentile Value using MySQL
I have a table which contains thousands of rows and I would like to calculate the 90th percentile for one of the fields, called 'round'.
For example, select the value of round which is at the 90th percentile.
I don't see a straightforward way to do…

Daniel C
- 607
- 2
- 8
- 20
16
votes
2 answers
Pandas DataFrame groupby describe ~8x slower than computing separately
The following code summarizes numeric data using two different approaches.
The first approach uses DataFrame().describe() and passes some specific extra percentiles.
The second approach separately computes the summary stats (mean, std, N),…

Randall Goodwin
- 1,916
- 2
- 18
- 34
16
votes
4 answers
Calculate percentile from a long array?
Given a long array of latencies in milliseconds, I want to calculate percentiles from them. I have the method below, which does the work, but I am not sure how to verify whether it gives an accurate result.
public static long[]…
user5447339
15
votes
3 answers
How do I get the percentile for a row in a pandas dataframe?
Example DataFrame Values -
0 78
1 38
2 42
3 48
4 31
5 89
6 94
7 102
8 122
9 122
stats.percentileofscore(temp['INCOME'].values, 38, kind='mean')
15.0
stats.percentileofscore(temp['INCOME'].values, 38,…

bbennett36
- 6,065
- 10
- 20
- 37
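The 15.0 in the excerpt can be reproduced by hand. The sketch below is a plain-Python illustration of three of the `kind` definitions (`'strict'`, `'weak'`, `'mean'`) as I understand them from the SciPy documentation; `percentile_of_score` is a hypothetical stand-in, not SciPy's code.

```python
def percentile_of_score(values, score, kind="mean"):
    """Percentile rank of score relative to values (illustrative sketch)."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    below = sum(v < score for v in values)         # strictly less than score
    at_or_below = sum(v <= score for v in values)  # less than or equal to score
    if kind == "strict":
        return 100.0 * below / n
    if kind == "weak":
        return 100.0 * at_or_below / n
    if kind == "mean":
        # Average of the strict and weak results.
        return 100.0 * (below + at_or_below) / (2 * n)
    raise ValueError(f"unsupported kind: {kind!r}")


income = [78, 38, 42, 48, 31, 89, 94, 102, 122, 122]
print(percentile_of_score(income, 38, kind="strict"))  # 10.0
print(percentile_of_score(income, 38, kind="weak"))    # 20.0
print(percentile_of_score(income, 38, kind="mean"))    # 15.0
```

Only one value (31) lies strictly below 38 and two values lie at or below it, so the mean of 10% and 20% gives the 15.0 shown above.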
14
votes
2 answers
Data structure for efficient percentile lookups?
Suppose that you have a large collection of key/value pairs, where the value is some arbitrary real number. You're interested in creating a data structure supporting the following operations:
Insert, which adds a new key/value pair to the…

templatetypedef
- 362,284
- 104
- 897
- 1,065
13
votes
2 answers
Python Pandas - how is the 25th percentile calculated by the describe function
For a given dataset in a data frame, when I apply the describe function, I get the basic stats which include min, max, 25%, 50% etc.
For example:
data_1 = pd.DataFrame({'One':[4,6,8,10]},columns=['One'])
data_1.describe()
The output is:
…

Gublooo
- 2,550
- 8
- 54
- 91
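As a hedged illustration of where such values can come from: pandas quantiles use linear interpolation between order statistics by default, and the plain-Python sketch below mimics that rule for the data in the question. `quantile_linear` is a hypothetical helper, not a pandas function.

```python
import math


def quantile_linear(values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)   # fractional index into the sorted data
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


data = [4, 6, 8, 10]
print(quantile_linear(data, 0.25))  # 5.5
print(quantile_linear(data, 0.50))  # 7.0
print(quantile_linear(data, 0.75))  # 8.5
```

For the 25% row the fractional index is 0.25 * 3 = 0.75, so the result sits three quarters of the way from 4 to 6.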
12
votes
2 answers
Reliably retrieve the reverse of the quantile function
I have read other posts (such as here) on getting the "reverse" of quantile -- that is, to get the percentile that corresponds to a certain value in a series of values.
However, the answers don't give me the same value as quantile for the same…

dave_in_newengland
- 251
- 2
- 8
12
votes
5 answers
vectorize percentile value of column B of column A (for groups)
For every pair of src and dest airport cities I want to return a percentile of column a given a value of column b.
I can do this manually as such:
example df with only 2 pairs of src/dest (I have thousands in my actual df):
dt src dest a b
0 …

codingknob
- 11,108
- 25
- 89
- 126
12
votes
2 answers
NumPy percentile function different from MATLAB's percentile function
When I try to calculate the 75th percentile in MATLAB, I get a different value than I do in NumPy.
MATLAB:
>> x = [ 11.308 ; 7.2896; 7.548 ; 11.325 ; 5.7822; 9.6343;
7.7117; 7.3341; 10.398 ; 6.9675; 10.607 ; 13.125 ;
…

James
- 2,635
- 5
- 23
- 30
11
votes
5 answers
Java Apache Commons getPercentile() different result than MS Excel percentile
I have an algorithm that calculates the percentile(85) with Apache Commons of a series of values (12 values), for a later evaluation with a threshold to make a decision. The result is similar to the one given by Excel, but not equal, and sometimes…

Jav_Rock
- 22,059
- 20
- 123
- 164
11
votes
3 answers
How to compute the percentile in a PySpark dataframe for each key?
I have a PySpark dataframe that consists of three columns x, y, z.
X may have multiple rows in this dataframe. How can I compute the percentile of each key in x separately?
+------+---------+------+
| Name| Role|Salary|
+------+---------+------+
| …

bib
- 944
- 3
- 15
- 32
11
votes
2 answers
How to print 95 and 99 Percentiles in the jmeter aggregate report command line?
I am trying to print the 95th and 99th percentile response times in the JMeter aggregate report from the command line.
For this, I have tried the solution mentioned here: Jmeter: Generating aggregate report through command line is not including…

user6348718
- 1,355
- 5
- 21
- 28
11
votes
4 answers
How to calculate the percentile?
I have access logs such as below stored in a mongodb instance:
Time Service Latency
[27/08/2013:11:19:22 +0000] "POST Service A HTTP/1.1" 403
[27/08/2013:11:19:24 +0000] "POST Service B…

user2574093
- 111
- 1
- 4