Questions tagged [percentile]

In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations falls.

A closely related concept is "quantile".
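
As a rough illustration of the definition above, here is a minimal Python sketch using the nearest-rank convention. This is only one of several ways to compute a percentile; libraries that interpolate between data points may return different values. The function name `nearest_rank_percentile` and the sample data are made up for this example.

```python
import math

def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank rule."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(values)
    # Smallest 1-based rank such that at least p% of the data is at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

sample = [15, 20, 35, 40, 50]  # illustrative data, not from any question
print(nearest_rank_percentile(sample, 40))   # 20
print(nearest_rank_percentile(sample, 50))   # 35
print(nearest_rank_percentile(sample, 100))  # 50
```

With nearest-rank, the result is always one of the observed values, which is why it was chosen here for simplicity.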

739 questions
4
votes
1 answer

How to select observations that are within a certain quantile

I have data (~1000 rows) that look like this: head(data) alt alb alp alt_zscore alb_zscore alp_zscore 1 11 2.60 9 -1.54 -7.82 -0.949 2 12 5.37 86.3 -1.45 …
burphound
  • 161
  • 7
4
votes
3 answers

How to bin ordered data by percentile for each id in R dataframe [r]

I have dataframe that contains 70-80 rows of ordered response time (rt) data for each of 228 people each with a unique id# (everyone doesn't have the same amount of rows). I want to bin each person's RTs into 5 bins. I want the 1st bin to be their…
Matt
  • 185
  • 1
  • 6
4
votes
2 answers

Show percentiles of Variable A, while the classification of percentiles is based on Variable B

I have a dataset that looks like the following: INCOME WEALTH 10.000 100000 15.000 111000 14.200 123456 12.654 654321 I have many more rows. I now want to find how much INCOME a household in a specific WEALTH percentile has.…
Jakob
  • 43
  • 3
4
votes
2 answers

How do I get a percentile from a list in CSharp?

I'm creating a program where I would like to get the percentile of score x out of a list(List results). I know that the formula is [(A + (0.5) B) / n] * 100 where 'A' = # of scores lower than score x, 'B' = # of scores equal to score x and 'n' =…
Chielle
  • 49
  • 1
  • 2
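
The formula quoted in the excerpt above, [(A + (0.5) B) / n] * 100, can be sketched as follows. This is a Python illustration rather than C#, and it is not taken from the question or its answers; `percentile_rank` and the sample scores are hypothetical.

```python
def percentile_rank(scores, x):
    """Percentile rank of x: 100 * (A + 0.5 * B) / n.

    A = number of scores lower than x
    B = number of scores equal to x
    n = total number of scores
    """
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < x)   # A
    equal = sum(1 for s in scores if s == x)  # B
    return 100 * (below + 0.5 * equal) / len(scores)

results = [50, 60, 70, 70, 80]  # illustrative data
print(percentile_rank(results, 70))  # 60.0
print(percentile_rank(results, 80))  # 90.0
```

Counting half of the tied scores keeps the rank symmetric: a score shared by every entry in the list gets a rank of 50.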
4
votes
1 answer

What is Percentile in Azure metrics - Web App Slow?

I need to know what is percentile in Azure metric - Web App Slow. I am trying to analyze Web App Slow feature in Azure under Diagnosis. there are 3 legends - 50th percentile, 90th percentile, 95th percentile.
SSD
  • 1,041
  • 3
  • 19
  • 39
4
votes
0 answers

Prometheus Grafana - How to visualize particular percentile calculated over selected time period

Suppose that I have preconfigured request times buckets histograms, like the following: {le="0.009786708"} 0 {le="12.884901886"} 103 {le="0.003844776"} 0 {le="0.008388607"} 0 {le="0.357913941"} 97 {le="0.447392426"} 97 {le="0.805306366"} …
leebake
  • 61
  • 1
  • 4
4
votes
1 answer

Python: Matplotlib - probability plot for several data set

I have several data sets (distribution) as follows: set1 = [1,2,3,4,5] set2 = [3,4,5,6,7] set3 = [1,3,4,5,8] How do I plot a scatter plot with the data sets above with the y-axis being the probability (i.e. the percentile of the distribution in…
siva
  • 2,105
  • 4
  • 20
  • 37
4
votes
2 answers

Calculate percentiles if we have probability density function data as x and y

I have data extracted from a pdf graph where x represents incubation times and y is the density in a csv file. I would like to calculate the percentiles, such as 95%. I'm a bit confused, should I calculate the percentile using the x values only,…
4
votes
2 answers

Pandas: how to drop the lowest 5th percentile for each indexed group?

I have the following issue with python pandas (I am relatively new to it): I have a simple dataset with a column for date, and a corresponding column of values. I am able to sort this Dataframe by date and value by doing the following: df =…
Berti1989
  • 185
  • 1
  • 14
4
votes
1 answer

use of frequency argument in percentile function in spark sql

I'm trying to use the percentile function in spark-SQL. Data: col1 ---- 198 15.8 198 198 198 198 198 198 198 198 198 If I use the code below the value I get of percentile is incorrect. select percentile('col1', .05) from tblname output: …
Vijay Jangir
  • 584
  • 3
  • 15
4
votes
1 answer

How to calculate a Confidence Interval using numpy.percentile() in Python

A homework question asked me to calculate a confidence interval for a mean. When I did it the traditional method and with numpy.percentile() -- I got different answers. I think that I may be misunderstanding how or when to use np.percentile(). My…
SherbertTheCat
  • 655
  • 2
  • 7
  • 9
4
votes
1 answer

How to add a column to a PySpark dataframe which contains the nth quantile of another column in the dataframe

I have a very large CSV file which has been imported as a PySpark dataframe: df. The dataframe contains many columns including column ireturn. I want to compute the 0.99 and 0.01 percentile of this column and then add another column to the dataframe…
Monirrad
  • 465
  • 1
  • 7
  • 17
4
votes
1 answer

np.percentile returns different median from np.median when inf is present

When inf values are present in an array, under certain conditions np.percentile can return NaN as the median, whereas np.median can return a finite value. >>> import numpy as np >>> np.percentile([np.inf, 5, 4], [10, 20, 30, 40, 50, 60, 70, 80,…
Himanshu Poddar
  • 7,112
  • 10
  • 47
  • 93
4
votes
1 answer

How To Calculate Exact 99.9th Percentile in Splunk

Does anyone know how to exactly calculate the 99.9th percentile in Splunk? I have tried a variety of methods as below, such as exactperc (but this only takes integer percentiles) and perc (but this approximates the result heavily). base | stats…
user1763328
  • 301
  • 2
  • 3
  • 11
4
votes
2 answers

PhP/MySQL: how to dynamically change my (large and always changing) database

Scenario I have a MySQL database with 10.000 rows. Setup of the database: ID UniqueKey Name Url Score ItemValue 1 5Zvr3 Google google.com 13 X 2 46cfG Radio radio.com -20 X 3 2fg64 …
JuliusSecret
  • 149
  • 1
  • 2
  • 12