I ran into a Hive query that calculates a count distinct without grouping, and it runs very slowly. How is this functionality implemented in Hive? Is there a UDAFCountDistinct for it?

leftjoin

1 Answer

Hive 1.2.0+ provides an automatic rewrite optimization for count(distinct). Check this setting:

set hive.optimize.distinct.rewrite=true;
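On versions without this setting, or when the rewrite does not kick in, the same idea can be applied by hand. A global count(distinct) with no GROUP BY typically ends up on a single reducer, which is the usual reason it is slow; deduplicating in a subquery first lets that step spread across many reducers. The sketch below assumes a hypothetical table `events` and column `user_id`, which are not from the question.

```sql
-- Hypothetical table `events` and column `user_id`, used only for illustration.

-- Original form: a global distinct count with no GROUP BY.
SELECT COUNT(DISTINCT user_id) FROM events;

-- Manual two-stage form: deduplicate first, then count the surviving rows.
SELECT COUNT(*)
FROM (
  SELECT user_id
  FROM events
  WHERE user_id IS NOT NULL   -- COUNT(DISTINCT ...) ignores NULLs, so drop them here too
  GROUP BY user_id
) deduped;
```

Both queries should return the same number; comparing their EXPLAIN output is a reasonable way to check which plan your Hive version actually produces.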
leftjoin