Questions tagged [percentile]

In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations fall.

A closely related concept is the "quantile".
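
As a rough illustration of the definition above, here is a minimal Python sketch using the nearest-rank convention. The function name `percentile_nearest_rank` and the sample data are hypothetical, chosen only for this example. Libraries differ in how they interpolate between observations, so other tools may return slightly different values for the same input.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the smallest observation such that at least p percent
    of the observations are less than or equal to it (nearest-rank rule)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the requested percentile within the sorted data.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


data = [15, 20, 35, 40, 50]  # illustrative values only
print(percentile_nearest_rank(data, 30))   # prints 20
print(percentile_nearest_rank(data, 100))  # prints 50
```

Nearest-rank always returns a value that is actually present in the data, which is one reason it is easy to check by hand. Interpolating methods can return values that lie between two observations.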

739 questions
20 votes
4 answers

Understanding histogram_quantile based on rate in Prometheus

According to the Prometheus documentation, to get the 95th percentile from a histogram metric I can use the following query: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) Source:…
evgeniy44
20 votes
10 answers

Fast Algorithm for computing percentiles to remove outliers

I have a program that needs to repeatedly compute the approximate percentile (order statistic) of a dataset in order to remove outliers before further processing. I'm currently doing so by sorting the array of values and picking the appropriate…
Eamon Nerbonne
17 votes
9 answers

Calculate Percentile Value using MySQL

I have a table which contains thousands of rows and I would like to calculate the 90th percentile for one of the fields, called 'round'. For example, select the value of round which is at the 90th percentile. I don't see a straightforward way to do…
Daniel C
16 votes
2 answers

Pandas DataFrame groupby describe ~8x slower than computing separately

The following code summarizes numeric data using two different approaches. The first approach uses the Dataframe().describe() and passes some specific extra percentiles. The second approach separately computes the summary stats (mean, std, N),…
Randall Goodwin
16 votes
4 answers

Calculate percentile from a long array?

Given a long array of latencies in milliseconds, I want to calculate percentiles from them. I have the method below, which does the work, but I am not sure how to verify whether it gives an accurate result. public static long[]…
user5447339
15 votes
3 answers

How do I get the percentile for a row in a pandas dataframe?

Example DataFrame Values - 0 78 1 38 2 42 3 48 4 31 5 89 6 94 7 102 8 122 9 122 stats.percentileofscore(temp['INCOME'].values, 38, kind='mean') 15.0 stats.percentileofscore(temp['INCOME'].values, 38,…
bbennett36
14 votes
2 answers

Data structure for efficient percentile lookups?

Suppose that you have a large collection of key/value pairs, where the value is some arbitrary real number. You're interested in creating a data structure supporting the following operations: Insert, which adds a new key/value pair to the…
templatetypedef
13 votes
2 answers

Python Pandas - how is the 25th percentile calculated by the describe function

For a given dataset in a data frame, when I apply the describe function, I get the basic stats which include min, max, 25%, 50% etc. For example: data_1 = pd.DataFrame({'One':[4,6,8,10]},columns=['One']) data_1.describe() The output is: …
Gublooo
12 votes
2 answers

Reliably retrieve the reverse of the quantile function

I have read other posts (such as here) on getting the "reverse" of quantile -- that is, to get the percentile that corresponds to a certain value in a series of values. However, the answers don't give me the same value as quantile for the same…
12 votes
5 answers

vectorize percentile value of column B of column A (for groups)

For every pair of src and dest airport cities I want to return a percentile of column a given a value of column b. I can do this manually as such: example df with only 2 pairs of src/dest (I have thousands in my actual df): dt src dest a b 0 …
codingknob
12 votes
2 answers

NumPy percentile function different from MATLAB's percentile function

When I try to calculate the 75th percentile in MATLAB, I get a different value than I do in NumPy. MATLAB: >> x = [ 11.308 ; 7.2896; 7.548 ; 11.325 ; 5.7822; 9.6343; 7.7117; 7.3341; 10.398 ; 6.9675; 10.607 ; 13.125 ; …
James
11 votes
5 answers

Java Apache Commons getPercentile() different result than MS Excel percentile

I have an algorithm that calculates the percentile(85) with Apache Commons of a series of values (12 values), for a later evaluation with a threshold to make a decision. The result is similar to the one given by Excel, but not equal, and sometimes…
Jav_Rock
11 votes
3 answers

How to compute the percentile in a PySpark dataframe for each key?

I have a PySpark dataframe that consists of three columns x, y, z. x may have multiple rows in this dataframe. How can I compute the percentile of each key in x separately? +------+---------+------+ | Name| Role|Salary| +------+---------+------+ | …
bib
11 votes
2 answers

How to print 95 and 99 Percentiles in the jmeter aggregate report command line?

I am trying to print the 95th and 99th percentile response times in the jmeter aggregate report from the command line. For this, I have tried the solution mentioned here: Jmeter: Generating aggregate report through command line is not including…
user6348718
11 votes
4 answers

How to calculate the percentile?

I have access logs such as below stored in a mongodb instance: Time Service Latency [27/08/2013:11:19:22 +0000] "POST Service A HTTP/1.1" 403 [27/08/2013:11:19:24 +0000] "POST Service B…
user2574093