Questions tagged [percentile]

In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations falls.

A closely related concept is "quantile".
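
As a rough illustration of the definition above, here is a minimal sketch in plain Python. It assumes the "linear" interpolation convention quoted in one of the questions listed below (`i + (j - i) * fraction`); other conventions exist and can give slightly different results. The function name `percentile_linear` and the sample data are hypothetical, chosen only for this example.

```python
import math


def percentile_linear(values, q):
    """Return the q-th percentile (0 <= q <= 100) of values.

    Uses linear interpolation between the two nearest ranks,
    which is only one of several common conventions.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")

    ordered = sorted(values)              # the caller does not need to pre-sort
    pos = (len(ordered) - 1) * q / 100    # fractional index into the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    fraction = pos - lo

    # i + (j - i) * fraction, where i and j are the neighbouring sorted values
    return ordered[lo] + (ordered[hi] - ordered[lo]) * fraction


sample = [50, 15, 40, 20, 35]             # made-up, deliberately unsorted
median = percentile_linear(sample, 50)
p40 = percentile_linear(sample, 40)
print(median, p40)
```

Because the function sorts a copy of its input, the order of the observations does not affect the result.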

739 questions
7 votes, 3 answers

How can I return the numerical boxplot data of all results using 1 mySQL query?

[tbl_votes] - id - item_id - vote Of course we can fix this by getting: the smallest observation (so) the lower quartile (lq) the median (me) the upper…
Wouter Dorgelo
7 votes, 1 answer

Prometheus latency graph in histogram and calculate percentile

I need to plot latency graph on prometheus by the histogram time-series, but I've been unsuccessful to display a histogram in grafana. What I expect is to be able to show: Y-axis is latency, x-axis is timeseries. Each line representing the…
td4u
7 votes, 1 answer

Calculate percentile on pyspark dataframe columns

I have a PySpark dataframe which contains an ID and then a couple of variables for which I want to calculate the 95% point. Part of the printSchema(): root |-- ID: string (nullable = true) |-- MOU_G_EDUCATION_ADULT: double (nullable = false) |--…
Wendy De Wit
7 votes, 1 answer

Google BigQuery APPROX_QUANTILES and getting true quartiles

According to the docs: Returns the approximate boundaries for a group of expression values, where number represents the number of quantiles to create. This function returns an array of number + 1 elements, where the first element is the approximate…
Tyler_1
7 votes, 2 answers

How to calculate the mean of the top 10% in R

My dataset contains multiple observations for different species. Each species has a different number of observations. Looking for a fast way in R to calculate the mean of the top 10% of values for a given variable for each species. I figured out how…
PGLS
7 votes, 1 answer

Sort data before using numpy.median

I'm measuring the median and percentiles of a sample of data using Python. import numpy as np xmedian=np.median(data) x25=np.percentile(data, 25) x75=np.percentile(data, 75) Do I have to use the np.sort() function on my data before measuring the…
Marika Blum
6 votes, 2 answers

Calculating percentiles in SQL

This should be very straightforward, but as a newbie to SQL I am really struggling. I've been recommended to use PERCENTILE_CONT with continuous (non-discrete) data. The data in question concerns two columns: (1) the IDs for a list of patients and…
user518206
6 votes, 1 answer

Numpy percentiles with linear interpolation - wrong value?

The linear interpolation formula for percentiles is: linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j. Suppose I have this list with 16 observations: test = [0, 1, 5, 5, 5, 6, 6, 7, 7, 8, 11,…
jerbear
6 votes, 1 answer

displaying the percentile distribution as a dataframe in python

I am trying to display the output of percentile distribution for each column as a dataframe as I want to export it to csv later. I have simply looped all the columns like this : for column in data: …
Cagdas Kanar
6 votes, 1 answer

Calculating Percentile in Python Pandas Dataframe

I'm trying to calculate the percentile of each number within a dataframe and add it to a new column called 'percentile'. This is my attempt: import pandas as pd from scipy import stats data =…
mattblack
6 votes, 2 answers

Percentile calculator

I have been trying to create a small method to calculate a given percentile from a seq. It works... almost. Problem is I don't know why it doesn't work. I was hoping one of you 'a bit smarter' people than me could help me with it. What I hope the…
Selena Hill
6 votes, 2 answers

How to plot 95 percentile and 5 percentile on ggplot2 plot with already calculated values?

I have this dataset and use this R code: library(reshape2) library(ggplot2) library(RGraphics) library(gridExtra) long <- read.csv("long.csv") ix <- 1:14 ggp2 <- ggplot(long, aes(x = id, y = value, fill = type)) + geom_bar(stat = "identity",…
Rlearner
6 votes, 1 answer

Calculating Percentile Score for every value in the list

I've been searching for a way to calculate the percentile rank for every value in a given list and I've been unsuccessful thus far. org.apache.commons.math3 gives you a way to fetch the pth percentile from a list of values but what I want is the…
LizardKing
6 votes, 3 answers

Can percentiles of a set of data be calculated in a map-reduce manner?

My understanding is to calculate percentiles, the data needs to be sorted. Would this be possible with a huge amount of data spread across multiple servers, without moving it around?
marathon
5 votes, 5 answers

How to calculate percentile with group by?

I have a data.table with over ten thousand of rows and it looks like this: DT1 <- data.table(ID = 1:10, result_2010 = c("TRUE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE"), …
Besz15