0

I am trying to get quantiles from my data- Lets say (0.25, 0.5 and 0.75 ) for a variable . I could figure out how to do it from a dataframe

Quantile <- mydataframe %>% summarize(quantile(variable_name,c(.25,0.5,0.75)))

However instead of a dataframe, when I try to get the quantiles from a table (after connecting to greenplum) it doesn't seem to work .

Any ideas on how to implement this? My guess is the translation to PostgreSQL query for quantile function is not available .

sourav
  • 179
  • 1
  • 1
  • 14

1 Answers1

0

In order to get quantiles out of postgreSQL you use the percentile_cont function.

percentile_cont(fraction) WITHIN GROUP (ORDER BY sort_expression)

http://www.postgresql.org/docs/9.4/static/functions-aggregate.html

PabTorre
  • 2,878
  • 21
  • 30
  • Thanks for the response. I have used percentile_cont to get quantiles from greenplum . However , in this case I am connecting to Greenplum through R and writing R functions for data manipulation. Is there a way to do what percentile _cont does from R ? Can dpylr do this in anyway ? – sourav Sep 05 '15 at 04:20
  • that function gets the percentiles from the DB side... To do the same on the R side maybe you could use R's quantile function? http://www.r-tutor.com/elementary-statistics/numerical-measures/percentile – PabTorre Sep 05 '15 at 05:07
  • The quantile function works well with dataframes . It is giving me errors when I am working directly with the database (and not creating a data frame) – sourav Sep 05 '15 at 17:45
  • Setup a sample database in fiddle that shows the problem, and maybe there we can nail it down. :) – PabTorre Sep 06 '15 at 02:29