In statistics, a percentile (or centile) is the value of a variable below which a given percentage of observations fall.
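As a rough illustration of that definition, the sketch below counts what percentage of observations fall strictly below a given value. The function name `percent_below` and the sample data are hypothetical, chosen only for this example; libraries differ on details such as whether ties count as "below".

```python
def percent_below(values, x):
    """Return the percentage of observations strictly below x."""
    if not values:
        raise ValueError("values must be non-empty")
    # Count observations smaller than x and scale to a percentage.
    return 100.0 * sum(v < x for v in values) / len(values)


# Hypothetical sample: the integers 1..10.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percent_below(scores, 4))  # three of ten values are below 4
```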
Questions tagged [percentile]
739 questions
2
votes
2 answers
functools: computing interquartile range
I use functools to compute percentiles this way:
import functools
import numpy as np
percentiles = tuple(functools.partial(np.percentile, q=q) for q in (75, 85, 95))
percentiles
(functools.partial(, q=75),
…
user13641081
2
votes
2 answers
Calculating percentile for each gridpoint in xarray
I am currently using xarray to make probability maps. I want to use a statistical assessment like a “counting” exercise. Meaning, for all data points in NEU count how many times both variables jointly exceed their threshold. That means 1th…

Rienmachien
2
votes
2 answers
Finding Percentile in Spark-Scala per a group
I am trying to do a percentile over a column using a Window function as below. I have referred here to use the ApproxQuantile definition over a group.
val df1 = Seq(
(1, 10.0), (1, 20.0), (1, 40.6), (1, 15.6), (1, 17.6), (1, 25.6),
(1,…

abc_spark
2
votes
1 answer
Rank computation considering time stamp in grouped data
In my game dataset, I have observations for several game players for several points in time. For each observation, I want to compute a rank for this player based on the number of points compared to the number of points of other players at this point…

Scijens
2
votes
1 answer
Percentile calculation in HIVE
How can I calculate the 25th percentile in Hive using SQL? Let's say there are category, sub-category and sales columns. How can I calculate the 25th percentile of sales? I tried to use percentile(sales, 0.25) in Hive but it is throwing an…

Karan6787
2
votes
2 answers
GCP Console: How are percentile charts calculated?
I do not understand how the charts that show percentiles are calculated inside the Google Cloud Platform Monitoring UI.
Here is how I am creating the standard chart:
Example log events
Creating a log-based metric for request…

zino
2
votes
5 answers
Using NumPy, how is the 25th percentile calculated for the numbers 1 to 10?
from numpy import percentile
import numpy as np
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# calculate quartiles
quartile_1 = percentile(data, 25)
quartile_3 = percentile(data, 75)
# calculate min/max
print(quartile_1)  # shows 3.25
print(quartile_3) #…

The DataScience
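The 3.25 reported in the question above is consistent with linear interpolation between order statistics, which is NumPy's default method for `percentile`. The sketch below reproduces that arithmetic in plain Python; the helper name `percentile_linear` is hypothetical and this is not NumPy's actual implementation, only the same formula under that assumption.

```python
import math


def percentile_linear(data, q):
    """Percentile by linear interpolation between the two nearest ranks.

    Assumes the position of the q-th percentile in the sorted data is
    (n - 1) * q / 100, counted from zero.
    """
    xs = sorted(data)
    if not xs:
        raise ValueError("data must be non-empty")
    pos = (len(xs) - 1) * q / 100.0   # fractional index into sorted data
    lo = math.floor(pos)              # index of the value just below
    hi = math.ceil(pos)               # index of the value just above
    frac = pos - lo                   # how far between the two neighbours
    return xs[lo] + (xs[hi] - xs[lo]) * frac


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# For q=25: pos = 9 * 0.25 = 2.25, so the result lies a quarter of the
# way from xs[2] = 3 to xs[3] = 4.
print(percentile_linear(data, 25))  # 3.25
print(percentile_linear(data, 75))  # 7.75
```

Other interpolation methods (nearest rank, midpoint, and so on) give different answers for the same data, which is a common source of disagreement between tools.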
2
votes
1 answer
Compute rolling percentiles in PySpark
I have a dataframe with dates, ID (let's say of a city) and two columns of temperatures (in my real dataframe I have a dozen columns to compute).
I want to "rank" those temperatures for a given window. I want this ranking to be scaled from 0 (the…

virgilus
2
votes
1 answer
How to implement a SAS percentile statement in R?
I have this SAS statement:
proc univariate data = df noprint;
class &var1. &var2.;
var &var3.;
output out = STAT PCTLPTS = 2 5 98 99 95 PCTLPRE = P;
I have output from the SAS proc like this:
How can I get the same result in R? (with 5 P-columns and…

red_quark
2
votes
2 answers
Calculate percentiles ignoring missing values
I have a PySpark dataframe with columns ID and BALANCE.
I am trying to bucket the column balance into 100 percentile (1-100%) buckets and calculate how many IDs fall in each bucket.
I cannot use anything related to RDD, I can only use PySpark…

Ninjia718
2
votes
2 answers
Plot a histogram, based on percentiles
I have a frame with the following structure:
df = pd.DataFrame({'ID': np.random.randint(1, 13, size=1000),
'VALUE': np.random.randint(0, 300, size=1000)})
How could I plot the graph, where on the X-axis there will be percentiles…

Denis Ka
2
votes
2 answers
Is it possible to get the PERCENT_RANK for a single record, but relative to the entire table?
I would like the PERCENT_RANK value for a single record, but in relation to the entire table. Is this possible?
Examples I've seen are like this:
SELECT Name, Salary,
PERCENT_RANK() OVER (ORDER BY Salary)
FROM Employees
Notice that it's…

Deane
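For reference on the question above, standard SQL defines PERCENT_RANK as (rank - 1) / (total rows - 1), where rank is one plus the number of rows that sort strictly before the current row. The sketch below applies that formula to a single value against a whole column in plain Python; the function name `percent_rank` and the salary figures are hypothetical, and this illustrates the formula rather than any particular database's behaviour.

```python
def percent_rank(values, x):
    """PERCENT_RANK of x relative to all values: (rank - 1) / (n - 1)."""
    n = len(values)
    if n <= 1:
        # SQL returns 0 when the partition has a single row.
        return 0.0
    # rank - 1 equals the number of rows that sort strictly before x.
    smaller = sum(v < x for v in values)
    return smaller / (n - 1)


# Hypothetical salaries for the whole table.
salaries = [30000, 40000, 50000, 60000, 70000]
print(percent_rank(salaries, 50000))  # 0.5
```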
2
votes
3 answers
Get percentiles from a grouped dataframe
I have a dataframe that has 2 experiment groups and I am trying to get percentile distributions. However, the data is already grouped:
df = pd.DataFrame({'group': ['control', 'control', 'control','treatment','treatment','treatment'],
…

Utopia025
2
votes
1 answer
Understanding numpy percentile computation
I understand percentile in the context of test scores with many examples (e.g., your SAT score falls in the 99th percentile), but I am not sure I understand percentile in the following context and what is going on. Imagine a model outputs probabilities…

Jane Sully
2
votes
2 answers
Calculate percentile with groupBy on PySpark dataframe
I am trying to groupBy and then calculate percentile on PySpark dataframe. I've tested the following piece of code according to this Stack Overflow post:
from pyspark.sql.types import FloatType
import pyspark.sql.functions as func
import numpy as…

Marc S