Questions tagged [percentile]

In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations falls.

A closely related concept is "quantile".
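
A minimal sketch of the definition, assuming plain Python lists. The helper names `percentile_rank` and `percentile_nearest_rank` are hypothetical and not taken from any library. Several conventions exist (strictly below versus at or below, with or without interpolation), so library functions may return slightly different values than this nearest-rank version.

```python
import math

def percentile_rank(data, value):
    """Percentage of observations in `data` that are strictly below `value`."""
    below = sum(1 for x in data if x < value)
    return 100.0 * below / len(data)

def percentile_nearest_rank(data, p):
    """Nearest-rank percentile: the smallest observation such that at least
    p percent of the data is at or below it (no interpolation)."""
    ordered = sorted(data)
    # Rank is 1-based; clamp to 1 so that p = 0 returns the minimum.
    k = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[k - 1]

scores = [15, 20, 35, 40, 50]
print(percentile_rank(scores, 40))          # 60.0 (3 of 5 values are below 40)
print(percentile_nearest_rank(scores, 50))  # 35
```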

739 questions
10 votes, 1 answer

Percentile rank calculation

I'm attempting to calculate the percentile rank of a score using the Python statlib module. The percentileofscore function is supposed to return a value between 0 and 100; however, it regularly produces numbers outside this range. An example: >> a =…
monofonik
10 votes, 1 answer

Calculate percentile for every value in a column of dataframe

I am trying to calculate percentile for every value in column a from a DataFrame x. Is there a better way to write the following piece of code? x["pcta"] = [stats.percentileofscore(x["a"].values, i) for i in…
Praveen Gupta Sanka
10 votes, 3 answers

Panda rolling window percentile rank

I am trying to calculate the percentile rank of data by column within a rolling window. test=pd.DataFrame(np.random.randn(20,3),pd.date_range('1/1/2000',periods=20),['A','B','C']) test Out[111]: A B C 2000-01-01…
user6435943
10 votes, 3 answers

How to compute 99% coverage in MATLAB?

I have a matrix in MATLAB and I need to find the 99% value for each column. In other words, the value such that 99% of the population has a larger value than it. Is there a function in MATLAB for this?
sfactor
9 votes, 4 answers

Compute percentile rank relative to a given population

I have "reference population" (say, v=np.random.rand(100)) and I want to compute percentile ranks for a given set (say, np.array([0.3, 0.5, 0.7])). It is easy to compute one by one: def percentile_rank(x): return (v
sds
9 votes, 2 answers

calculate 95 percentile of the list values in python

I have a dictionary in my program, and each value is a list of response times. I need to calculate the 95th percentile response time for each of these lists. I know how to calculate the average, but have no idea about the 95th percentile calculation.…
Surianan
9 votes, 0 answers

How to monitor 95th percentile web performance?

I want to monitor some performance percentiles (95th, 99th, etc) of a web application. Here's a blogpost about the percentile metric http://blog.catchpoint.com/2010/09/02/web_performance_metrics_best/ It's relatively easy to parse your own access…
9 votes, 2 answers

PostgreSQL equivalent of Oracle's PERCENTILE_CONT function

Has anyone found a PostgreSQL equivalent of Oracle's PERCENTILE_CONT function? I searched, and could not find one, so I wrote my own. Here is the solution that I hope helps you out. The company I work for wanted to migrate a Java EE web application…
thatdevguy
8 votes, 2 answers

How to sample from DataFrame based on percentile of a column?

Given a dataset like this: import pandas as pd rows = [{'key': 'ABC', 'freq': 100}, {'key': 'DEF', 'freq': 60}, {'key': 'GHI', 'freq': 50}, {'key': 'JKL', 'freq': 40}, {'key': 'MNO', 'freq': 13}, {'key': 'PQR', 'freq': 11}, {'key': 'STU',…
alvas
8 votes, 5 answers

Convert array into percentiles

I have an array that I want to convert to percentiles. For example, say I have a normally distributed array: import numpy as np import matplotlib.pyplot as plt arr = np.random.normal(0, 1, 1000) plt.hist(arr) For each value in that array, I want…
Chris
8 votes, 3 answers

How to calculate a percentile ranking of a column of data relative to another column using python

I have two columns of data representing the same quantity; one column is from my training data, the other is from my validation data. I know how to calculate the percentile rankings of the training data efficiently…
Doodles
8 votes, 2 answers

how to filter top 10 percentile of a column in a data frame group by id using dplyr

I have the following data frame:
id  total_transfered_amount  day
1   1000                     2
1   2000                     3
1   3000                     4
1   1000                     1
1   10000                    4
2   5000                     …
chessosapiens
8 votes, 1 answer

Hive: Is there a better way to percentile rank a column?

Currently, to percentile rank a column in Hive, I am using something like the following. I am trying to rank items in a column by what percentile they fall under, assigning a value from 0 to 1 to each item. The code below assigns a value from 0 to…
Charlie Haley
8 votes, 3 answers

How should the interquartile range be calculated in Python?

I have a list of numbers [1, 2, 3, 4, 5, 6, 7] and I want to have a function to return the interquartile range of this list of numbers. The interquartile range is the difference between the upper and lower quartiles. I have attempted to calculate…
d3pd
7 votes, 1 answer

Python-Matplotlib boxplot. How to show percentiles 0,10,25,50,75,90 and 100?

I would like to plot an EPSgram (see below) using Python and Matplotlib. The boxplot function only plots quartiles (0, 25, 50, 75, 100). So, how can I add two more boxes?
user1123132