I ran into a Hive query that calculates a count distinct
without grouping, and it runs very slowly. How is this functionality implemented in Hive? Is there a UDAFCountDistinct
for this?
leftjoin
1 Answer
Hive 1.2.0+ provides an automatic rewrite optimization for count(distinct). Check that this setting is enabled:
set hive.optimize.distinct.rewrite=true;
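For context, a count(distinct) without GROUP BY is typically slow because all rows are funneled into a single reducer for de-duplication. If the automatic rewrite is unavailable or does not kick in, the same idea can be written by hand: de-duplicate with a GROUP BY subquery first, which can be spread over many reducers, then count the resulting rows. The sketch below is only an illustration; the table `events` and column `user_id` are hypothetical names, not from the original question.

```sql
-- Hypothetical table and column names (events, user_id), used for illustration only.

-- Original form: every row goes to one reducer to be de-duplicated.
SELECT COUNT(DISTINCT user_id) FROM events;

-- Manual rewrite: the inner GROUP BY de-duplicates in parallel,
-- and the outer query only counts the already-distinct values.
-- COUNT(user_id) skips the NULL group, matching COUNT(DISTINCT) semantics.
SELECT COUNT(user_id)
FROM (
  SELECT user_id
  FROM events
  GROUP BY user_id
) deduped;
```

Comparing the output of `EXPLAIN` for both forms should show whether Hive has already applied an equivalent plan on its own.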

leftjoin