I would like to get the 20th percentile of a column in BigQuery using dplyr syntax with bigrquery, but the query fails. Here is a reproducible example:

library(bigrquery)
library(dplyr)
library(DBI)
billing <- "YOUR_BILLING_INFO"  # placeholder: replace with your own billing project ID

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)

natality <- tbl(con, "natality")

natality %>%
  filter(year %in% c(1969, 1970)) %>%
  group_by(year) %>%
  summarise(percentile_20 = percentile_cont(weight_pounds, 0.2))

I get the following error:

Error: Analytic function PERCENTILE_CONT cannot be called without an OVER clause at [1:16] [invalidQuery]

However, it is not clear how to include an OVER clause here. How can I get the 20th percentile with dplyr syntax?

pogibas
alex56
  • Since you are trying to write a query, could you post an example outside of dplyr? It would be less complicated to look at and reproduce. Try it on http://console.cloud.google.com/bigquery – Felipe Hoffa Jul 26 '18 at 05:00
  • It looks like window functions are not supported when using dplyr. https://github.com/tidyverse/dplyr/issues/2290 – AlienDeg Jul 26 '18 at 22:53
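
One possible way around the missing OVER clause is to pass a raw SQL fragment through `sql()`, which dbplyr inserts into the generated query unchanged. The sketch below is a suggestion only and has not been run against BigQuery. It uses a simulated lazy table (`natality_sim`, a hypothetical stand-in for `natality`) so the generated SQL can be inspected without a connection or billing project. The first query adds the analytic `PERCENTILE_CONT` with an explicit `OVER (PARTITION BY year)` and then keeps one row per year. The second swaps in a different technique, BigQuery's aggregate `APPROX_QUANTILES`, which works inside `summarise()` but returns an approximate value.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for tbl(con, "natality"): a simulated lazy table,
# so the SQL can be rendered without connecting to BigQuery.
natality_sim <- lazy_frame(
  year = 1969L,
  weight_pounds = 7.5,
  con = simulate_dbi()
)

# Option 1: analytic function with an explicit OVER clause.
# sql() passes the fragment to the database as written.
exact_query <- natality_sim %>%
  filter(year %in% c(1969, 1970)) %>%
  mutate(
    percentile_20 = sql(
      "PERCENTILE_CONT(weight_pounds, 0.2) OVER (PARTITION BY year)"
    )
  ) %>%
  distinct(year, percentile_20)  # the window value repeats on every row

# Option 2: approximate aggregate, usable directly in summarise().
# Offset 20 of 100 quantile buckets corresponds to the 20th percentile.
approx_query <- natality_sim %>%
  filter(year %in% c(1969, 1970)) %>%
  group_by(year) %>%
  summarise(
    percentile_20 = sql("APPROX_QUANTILES(weight_pounds, 100)[OFFSET(20)]")
  )

# Inspect the SQL that would be sent to the database.
sql_render(exact_query)
sql_render(approx_query)
```

Replacing `natality_sim` with the `natality` table from the question should send the same queries to BigQuery, assuming the raw fragments are accepted there as written.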

0 Answers