I would like to compute the 20th percentile of a column in BigQuery using dplyr syntax with bigrquery, but I keep getting an error. Here is a reproducible example:
library(bigrquery)
library(dplyr)
library(DBI)
billing <- YOUR_BILLING_INFO
con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
natality <- tbl(con, "natality")
natality %>%
  filter(year %in% c(1969, 1970)) %>%
  group_by(year) %>%
  summarise(percentile_20 = percentile_cont(weight_pounds, 0.2))
Running this produces the following error:
Error: Analytic function PERCENTILE_CONT cannot be called without an OVER clause at [1:16] [invalidQuery]
However, it is not clear to me how to include an OVER clause from dplyr. How can I get the 20th percentile for each year with dplyr syntax?
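For reference, here is a sketch of two directions that might work. Both assume that dbplyr passes a `sql()` fragment through to BigQuery unchanged, and both reuse the `natality` table defined above. Neither has been verified against this dataset. The first writes the OVER clause by hand inside `mutate()` and then collapses the result to one row per year. The second swaps in BigQuery's `APPROX_QUANTILES`, which is an ordinary aggregate and so fits inside `summarise()`, but it returns an approximate value rather than the interpolated one from `PERCENTILE_CONT`.

```r
library(dplyr)

# Option 1: exact percentile as a window (analytic) function.
# The OVER clause is written by hand inside sql(). Every row within a year
# receives the same value, so distinct() collapses to one row per year.
natality %>%
  filter(year %in% c(1969, 1970)) %>%
  mutate(
    percentile_20 = sql("PERCENTILE_CONT(weight_pounds, 0.2) OVER (PARTITION BY year)")
  ) %>%
  distinct(year, percentile_20)

# Option 2: approximate percentile as an ordinary aggregate.
# APPROX_QUANTILES(x, 100) returns an array of 101 boundaries;
# OFFSET(20) selects the one at the 20th percentile.
natality %>%
  filter(year %in% c(1969, 1970)) %>%
  group_by(year) %>%
  summarise(
    percentile_20 = sql("APPROX_QUANTILES(weight_pounds, 100)[OFFSET(20)]")
  )
```

Calling `show_query()` at the end of either pipeline prints the generated SQL, which is a quick way to check that the OVER clause or the aggregate ends up where expected before the query is run.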