In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations fall.
Questions tagged [percentile]
739 questions
7
votes
3 answers
How can I return the numerical boxplot data of all results using one MySQL query?
[tbl_votes]
- id
- item_id
- vote
Of course we can fix this by getting:
the smallest observation (so)
the lower quartile (lq)
the median (me)
the upper…

Wouter Dorgelo
7
votes
1 answer
Prometheus latency graph in histogram and calculate percentile
I need to plot a latency graph in Prometheus from the histogram time series, but I have been unable to display a histogram in Grafana.
What I expect is to be able to show:
The y-axis is latency and the x-axis is the time series.
Each line representing the…

td4u
7
votes
1 answer
Calculate percentile on pyspark dataframe columns
I have a PySpark dataframe which contains an ID and then a couple of variables for which I want to calculate the 95% point.
Part of the printSchema():
root
|-- ID: string (nullable = true)
|-- MOU_G_EDUCATION_ADULT: double (nullable = false)
|--…

Wendy De Wit
7
votes
1 answer
Google BigQuery APPROX_QUANTILES and getting true quartiles
According to the docs:
Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, where the first element is the approximate…

Tyler_1
7
votes
2 answers
How to calculate the mean of the top 10% in R
My dataset contains multiple observations for different species. Each species has a different number of observations. I am looking for a fast way in R to calculate the mean of the top 10% of values for a given variable for each species.
I figured out how…

PGLS
7
votes
1 answer
Sort data before using numpy.median
I'm measuring the median and percentiles of a sample of data using Python.
import numpy as np
xmedian=np.median(data)
x25=np.percentile(data, 25)
x75=np.percentile(data, 75)
Do I have to use the np.sort() function on my data before measuring the…

Marika Blum
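On the sorting question above: NumPy's np.median and np.percentile do not require pre-sorted input, since they order (or partition) the data internally. The following is a minimal sketch that checks this on hypothetical random data; the array `data` and the seed are assumptions for illustration, not the asker's data.

```python
import numpy as np

# Hypothetical sample data; a fixed seed keeps the run reproducible.
rng = np.random.default_rng(0)
data = rng.normal(size=1001)

# Statistics computed directly on the unsorted array.
unsorted_stats = (
    np.median(data),
    np.percentile(data, 25),
    np.percentile(data, 75),
)

# The same statistics computed after an explicit sort.
sorted_data = np.sort(data)
sorted_stats = (
    np.median(sorted_data),
    np.percentile(sorted_data, 25),
    np.percentile(sorted_data, 75),
)

# Both routes should agree, so the explicit sort is redundant.
print(np.allclose(unsorted_stats, sorted_stats))
```

Calling np.sort() first is harmless but only adds work.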
6
votes
2 answers
Calculating percentiles in SQL
This should be very straightforward, but as a newbie to SQL I am really struggling. I've been advised to use PERCENTILE_CONT with continuous (non-discrete) data.
The data in question concerns two columns: (1) the IDs for a list of patients and…

user518206
6
votes
1 answer
Numpy percentiles with linear interpolation - wrong value?
The linear interpolation formula for percentiles is:
linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
Suppose I have this list with 16 observations:
test = [0, 1, 5, 5, 5, 6, 6, 7, 7, 8, 11,…

jerbear
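The linear interpolation formula quoted in the question above can be written out by hand. This is a rough sketch, assuming the fractional index is (n - 1) * q / 100 over the sorted data, which is how NumPy's default "linear" method positions it. The helper name `percentile_linear` and the list `sample` are hypothetical; the question's own list is truncated and is not reproduced here.

```python
import math

import numpy as np


def percentile_linear(values, q):
    """Hypothetical helper: q-th percentile via i + (j - i) * fraction."""
    ordered = sorted(values)
    # Fractional position of the percentile in the sorted data (0-based).
    index = (len(ordered) - 1) * q / 100.0
    lower = math.floor(index)
    upper = math.ceil(index)
    fraction = index - lower
    # i and j are the two data points surrounding the fractional index.
    i, j = ordered[lower], ordered[upper]
    return i + (j - i) * fraction


sample = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical data, 8 observations

# Sorted: [1, 1, 2, 3, 4, 5, 6, 9]; index = 7 * 0.25 = 1.75,
# so i = 1, j = 2, fraction = 0.75.
print(percentile_linear(sample, 25))  # 1.75
print(np.percentile(sample, 25))      # NumPy's default method, for comparison
```

A common source of apparent mismatches is computing the index from n rather than n - 1, or using 1-based positions.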
6
votes
1 answer
Displaying the percentile distribution as a DataFrame in Python
I am trying to display the percentile distribution of each column as a DataFrame, because I want to export it to CSV later.
I have simply looped over all the columns like this:
for column in data:
…

Cagdas Kanar
6
votes
1 answer
Calculating Percentile in Python Pandas Dataframe
I'm trying to calculate the percentile of each number within a dataframe and add it to a new column called 'percentile'.
This is my attempt:
import pandas as pd
from scipy import stats
data =…

mattblack
6
votes
2 answers
Percentile calculator
I have been trying to write a small method to calculate a given percentile from a seq. It works... almost. The problem is that I don't know why it doesn't work. I was hoping one of you people who are 'a bit smarter' than me could help me with it.
What I hope the…

Selena Hill
6
votes
2 answers
How to plot the 95th and 5th percentiles on a ggplot2 plot with already calculated values?
I have this dataset and use this R code:
library(reshape2)
library(ggplot2)
library(RGraphics)
library(gridExtra)
long <- read.csv("long.csv")
ix <- 1:14
ggp2 <- ggplot(long, aes(x = id, y = value, fill = type)) +
geom_bar(stat = "identity",…

Rlearner
6
votes
1 answer
Calculating Percentile Score for every value in the list
I've been searching for a way to calculate the percentile rank for every value in a given list and I've been unsuccessful thus far.
org.apache.commons.math3 gives you a way to fetch the pth percentile from a list of values but what I want is the…

LizardKing
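The question above distinguishes fetching the pth percentile from computing a percentile rank for every value. The sketch below shows the second operation in Python rather than with org.apache.commons.math3. It assumes one common definition, the percentage of values less than or equal to each value; other definitions (strictly less than, or the mean of the two) give different numbers. The function name `percentile_ranks` is hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Hypothetical helper: percent of values <= each value, in input order."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= v.
    return [100.0 * bisect_right(ordered, v) / n for v in values]


scores = [10, 20, 20, 40]  # hypothetical data
print(percentile_ranks(scores))  # [25.0, 75.0, 75.0, 100.0]
```

Sorting once and using binary search keeps the cost at O(n log n) instead of comparing every pair of values.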
6
votes
3 answers
Can percentiles of a set of data be calculated in a map-reduce manner?
My understanding is that to calculate percentiles, the data needs to be sorted. Would this be possible with a huge amount of data spread across multiple servers, without moving it around?

marathon
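One way to approach the map-reduce question above, if an approximate answer is acceptable, is to have each server build a histogram over shared bin edges and then merge the histograms by adding counts. This is only a sketch under stated assumptions: the value range is known in advance, the result is accurate to roughly one bin width, and the function names (`map_histogram`, `reduce_histograms`, `approx_percentile`) are hypothetical. It is not an exact distributed percentile algorithm.

```python
from functools import reduce

import numpy as np


def map_histogram(chunk, edges):
    """Map step: per-partition bin counts over the shared edges."""
    counts, _ = np.histogram(chunk, bins=edges)
    return counts


def reduce_histograms(a, b):
    """Reduce step: histograms over identical edges merge by addition."""
    return a + b


def approx_percentile(counts, edges, q):
    """Upper edge of the bin where the cumulative count reaches q percent."""
    cumulative = np.cumsum(counts)
    target = q / 100.0 * cumulative[-1]
    bin_index = int(np.searchsorted(cumulative, target))
    bin_index = min(bin_index, len(counts) - 1)
    return float(edges[bin_index + 1])


# Hypothetical data: values in [0, 100) split across three "servers".
rng = np.random.default_rng(0)
data = rng.uniform(0, 100, size=30_000)
chunks = np.array_split(data, 3)

edges = np.linspace(0, 100, 1001)  # assumed known range, bin width 0.1
merged = reduce(reduce_histograms, (map_histogram(c, edges) for c in chunks))
estimate = approx_percentile(merged, edges, 90)

print(estimate, np.percentile(data, 90))
```

Only the fixed-size count arrays travel between servers, not the raw values. Values outside the assumed range are dropped by np.histogram, so the edges need to cover the data.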
5
votes
5 answers
How to calculate percentile with group by?
I have a data.table with over ten thousand rows that looks like this:
DT1 <- data.table(ID = 1:10,
result_2010 = c("TRUE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE"),
…

Besz15