Hi everyone, in Hive we use
select word, count(*) as cnt from table group by word order by cnt desc limit N
for a top-N query.
As we know, this query is not fast. I have been reading about approximate algorithms for top-k queries, such as the Count Sketch algorithm and others.
Could we add an approximate algorithm to Hive to speed up top-k queries?
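
To make the idea concrete, here is a rough Python sketch of the kind of approach I mean: a Count Sketch combined with a small candidate set for the top-k words. It is not Hive code and not a proposed implementation. The names (CountSketch, approx_top_k) and the depth/width defaults are my own assumptions for illustration only.

```python
import hashlib
from statistics import median


class CountSketch:
    """Minimal Count Sketch: `depth` rows of `width` signed counters."""

    def __init__(self, depth=5, width=256):
        # depth and width are illustrative defaults, not tuned values
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, row, word):
        # One salted digest per row yields both the bucket and the +/-1 sign
        digest = hashlib.blake2b(
            word.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        value = int.from_bytes(digest, "little")
        bucket = (value >> 1) % self.width
        sign = 1 if value & 1 else -1
        return bucket, sign

    def add(self, word, count=1):
        for row in range(self.depth):
            bucket, sign = self._hash(row, word)
            self.table[row][bucket] += sign * count

    def estimate(self, word):
        # The median of the per-row signed counters is the frequency estimate
        per_row = []
        for row in range(self.depth):
            bucket, sign = self._hash(row, word)
            per_row.append(sign * self.table[row][bucket])
        return median(per_row)


def approx_top_k(words, k, depth=5, width=256):
    """Single pass over `words`, keeping at most k candidate words."""
    sketch = CountSketch(depth, width)
    candidates = {}  # word -> latest estimated count
    for word in words:
        sketch.add(word)
        est = sketch.estimate(word)
        if word in candidates or len(candidates) < k:
            candidates[word] = est
        else:
            # Evict the weakest candidate if the new word now looks heavier
            weakest = min(candidates, key=candidates.get)
            if est > candidates[weakest]:
                del candidates[weakest]
                candidates[word] = est
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)


words = ["hive"] * 50 + ["pig"] * 20 + ["spark"] * 5 + ["hdfs"] * 2
top2 = approx_top_k(words, k=2)
print(top2)
```

The point is that memory stays fixed at depth * width counters plus k candidates, and no full sort over all groups is needed. The counts are estimates, so hash collisions can make the result differ from the exact query.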