In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations fall.
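The definition above can be made concrete with a minimal sketch. It assumes the nearest-rank convention, which is only one of several percentile definitions in common use; the function name `nearest_rank_percentile` and the sample data are hypothetical.

```python
import math


def nearest_rank_percentile(values, p):
    """Return the smallest observation with at least p percent of the data at or below it.

    Assumption: nearest-rank method, no interpolation between observations.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the observation that covers p percent of the data
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


scores = [15, 20, 35, 40, 50]  # hypothetical sample
print(nearest_rank_percentile(scores, 40))  # 20
```

Libraries that interpolate between observations can return different values for the same data, so results may not match this sketch exactly.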
Questions tagged [percentile]
739 questions
4 votes, 1 answer
How to select observations that are within a certain quantile
I have data (~1000 rows) that look like this:
head(data)
alt alb alp alt_zscore alb_zscore alp_zscore
1 11 2.60 9 -1.54 -7.82 -0.949
2 12 5.37 86.3 -1.45 …

burphound
4 votes, 3 answers
How to bin ordered data by percentile for each id in R dataframe [r]
I have a dataframe that contains 70-80 rows of ordered response time (rt) data for each of 228 people, each with a unique id# (not everyone has the same number of rows). I want to bin each person's RTs into 5 bins. I want the 1st bin to be their…

Matt
4 votes, 2 answers
Show percentiles of Variable A, while the classification of percentiles is based on Variable B
I have a dataset that looks like the following:
INCOME   WEALTH
10.000   100000
15.000   111000
14.200   123456
12.654   654321
I have many more rows.
I now want to find how much INCOME a household in a specific WEALTH percentile has.…

Jakob
4 votes, 2 answers
How do I get a percentile from a list in CSharp?
I'm creating a program where I would like to get the percentile of score x out of a list(List results). I know that the formula is [(A + (0.5) B) / n] * 100 where 'A' = # of scores lower than score x, 'B' = # of scores equal to score x and 'n' =…
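The formula quoted in this excerpt can be sketched as follows. The question is about C#; Python is used here for illustration only. The excerpt is cut off before it defines 'n', so treating n as the total number of scores is an assumption, and the name `percentile_rank` and the sample scores are hypothetical.

```python
def percentile_rank(scores, x):
    """Percentile rank of x using [(A + 0.5 * B) / n] * 100.

    A = number of scores lower than x
    B = number of scores equal to x
    n = total number of scores (assumed; the excerpt is truncated here)
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    below = sum(1 for s in scores if s < x)
    equal = sum(1 for s in scores if s == x)
    return (below + 0.5 * equal) / len(scores) * 100


results = [50, 60, 70, 70, 80, 90]  # hypothetical scores
print(percentile_rank(results, 70))  # 50.0
```

Counting ties at half weight keeps the rank symmetric: a score tied with others sits in the middle of its tied group rather than at its bottom or top.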

Chielle
4 votes, 1 answer
What is Percentile in Azure metrics - Web App Slow?
I need to know what percentile means in the Azure metric Web App Slow. I am trying to analyze the Web App Slow feature in Azure under Diagnosis. There are 3 legends: 50th percentile, 90th percentile, 95th percentile.

SSD
4 votes, 0 answers
Prometheus Grafana - How to visualize particular percentile calculated over selected time period
Suppose that I have histograms with preconfigured request-time buckets, like the following:
{le="0.009786708"} 0
{le="12.884901886"} 103
{le="0.003844776"} 0
{le="0.008388607"} 0
{le="0.357913941"} 97
{le="0.447392426"} 97
{le="0.805306366"} …

leebake
4 votes, 1 answer
Python: Matplotlib - probability plot for several data set
I have several data sets (distributions) as follows:
set1 = [1,2,3,4,5]
set2 = [3,4,5,6,7]
set3 = [1,3,4,5,8]
How do I plot a scatter plot with the data sets above with the y-axis being the probability (i.e. the percentile of the distribution in…

siva
4 votes, 2 answers
Calculate percentiles if we have probability density function data as x and y
I have data extracted from a pdf graph where x represents incubation times and y is the density in a csv file. I would like to calculate the percentiles, such as 95%. I'm a bit confused: should I calculate the percentile using the x values only,…

sakurami
4 votes, 2 answers
Pandas: how to drop the lowest 5th percentile for each indexed group?
I have the following issue with python pandas (I am relatively new to it): I have a simple dataset with a column for date, and a corresponding column of values. I am able to sort this Dataframe by date and value by doing the following:
df =…

Berti1989
4 votes, 1 answer
use of frequency argument in percentile function in spark sql
I'm trying to use the percentile function in spark-SQL.
Data:
col1
----
198
15.8
198
198
198
198
198
198
198
198
198
If I use the code below, the percentile value I get is incorrect.
select percentile('col1', .05) from tblname
output:
…

Vijay Jangir
4 votes, 1 answer
How to calculate a Confidence Interval using numpy.percentile() in Python
A homework question asked me to calculate a confidence interval for a mean. When I did it with the traditional method and with numpy.percentile(), I got different answers.
I think that I may be misunderstanding how or when to use np.percentile(). My…

SherbertTheCat
4 votes, 1 answer
How to add a column to a PySpark dataframe which contains the nth quantile of another column in the dataframe
I have a very large CSV file which has been imported as a PySpark dataframe: df. The dataframe contains many columns including column ireturn. I want to compute the 0.99 and 0.01 percentile of this column and then add another column to the dataframe…

Monirrad
4 votes, 1 answer
np.percentile returns different median from np.median when inf is present
When inf values are present in an array, under certain conditions np.percentile can return NaN as the median, whereas np.median can return a finite value.
>>> import numpy as np
>>> np.percentile([np.inf, 5, 4], [10, 20, 30, 40, 50, 60, 70, 80,…

Himanshu Poddar
4 votes, 1 answer
How To Calculate Exact 99.9th Percentile in Splunk
Does anyone know how to exactly calculate the 99.9th percentile in Splunk?
I have tried a variety of methods as below, such as exactperc (but this only takes integer percentiles) and perc (but this approximates the result heavily).
base | stats…

user1763328
4 votes, 2 answers
PHP/MySQL: how to dynamically change my (large and always changing) database
Scenario
I have a MySQL database with 10.000 rows. Setup of the database:
ID UniqueKey Name Url Score ItemValue
1 5Zvr3 Google google.com 13 X
2 46cfG Radio radio.com -20 X
3 2fg64 …

JuliusSecret