I have a table called "transaction" in Hive which is partitioned on a column called "DS" which will have data like "2018-05-05", "2018-05-09", "2018-05-10" and so on
This table is populated overnight for the day which got completed. At any point, the table will have data till previous day
When I query trasaction table like
SELECT COUNT(*) FROM trasaction WHERE DS >= "current date";
I get
0 rows - which is correct because data has not been loaded yet for current and future dates
When I run below query
SELECT DISTINCT DS FROM trasaction WHERE DS >= "current date";
I get
2018-05-09
2018-05-10
2018-05-11
2018-05-12
2018-05-13
...
2018-08-30
I have checked the HDFS location and partition folders already exists for future dates as well. I am guessing the people who do ETL, have already added partition for future dates using
ALTER TABLE trasaction ADD PARTITION (DS = '2018-05-13')
and so on
Am I doing wrong here? Even if the output of both queries are correct, what is the reason for above output?