How can I calculate the 25th percentile in Hive using SQL? Say there are category, sub-category, and sales columns. How can I calculate the 25th percentile of sales? I tried percentile(sales, 0.25) in Hive, but it throws an error:

Error while compiling statement: FAILED: NoMatchingMethodException No matching method for class org.apache.hadoop.hive.ql.udf.UDAFPercentile with (double, decimal(2,2)). Possible choices: FUNC(bigint, array) FUNC(bigint, double)

leftjoin
Karan6787

1 Answer

The documentation says:

A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral.

Use percentile_approx for non-integral values:

percentile_approx(DOUBLE col, p [, B]) - Returns an approximate pth percentile of a numeric column (including floating point types) in the group. The B parameter controls approximation accuracy at the cost of memory. Higher values yield better approximations, and the default is 10,000. When the number of distinct values in col is smaller than B, this gives an exact percentile value.
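Applied to the question, a minimal sketch might look like the query below. The table name sales_data is an assumption (the question does not name the table), and the column names category, sub_category and sales are guessed from the question's description. Drop the GROUP BY and the two grouping columns if you want a single 25th percentile over the whole table.

```sql
-- Hypothetical table and column names; adjust to your schema.
-- percentile_approx accepts non-integral input such as DOUBLE,
-- unlike percentile, which requires an integer column.
SELECT category,
       sub_category,
       percentile_approx(sales, 0.25) AS sales_p25
FROM sales_data
GROUP BY category, sub_category;
```

With the default B of 10,000, the result is exact whenever a group has fewer than 10,000 distinct sales values, per the documentation quoted above.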

leftjoin