
We are running the latest Hortonworks HDP, with Hive 3.1.0.

I have a problem when counting the rows that match a given condition: count(*) returns a wrong value compared with a plain select run side by side with the same conditions.

Example :

select *
from mydata
where product = "157536" and
      date = "2019-03-05";

=> gives 34 rows

select count(*)
from mydata
where product = "157536" and
      date = "2019-03-05";

=> gives a count of 9

After searching online, I've tried:

ANALYZE TABLE mydata COMPUTE STATISTICS; (run before the count, but to no avail)

I also tried a repair table.

I also tried playing with these two parameters: hive.stats.autogather and hive.compute.query.using.stats, but nothing changed.
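For reference, a session-level form of that last attempt might look like the sketch below. It assumes the settings are applied in the same session as the query, and it reuses the table and filter from the example above. When hive.compute.query.using.stats is true, Hive may answer some count(*) queries from metastore statistics instead of scanning the data, so turning it off is one way to rule out stale statistics; whether that applies to this table is an assumption, not a confirmed diagnosis.

```sql
-- Force count(*) to be computed from the data rather than from metastore statistics.
SET hive.compute.query.using.stats=false;
-- Keep automatic gathering of basic statistics on insert (this is the default).
SET hive.stats.autogather=true;

SELECT count(*)
FROM mydata
WHERE product = "157536"
  AND `date` = "2019-03-05";  -- backticks are a precaution, since DATE is a reserved keyword in Hive
```

If the count still differs from the number of rows returned by the plain select, stale statistics are unlikely to be the cause.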

Additional info: Hive is running on Tez.

  • Perhaps the table is changing between the times when you run the queries. Or you are querying different databases. – Gordon Linoff Jun 25 '19 at 13:15
  • I am certain that I have only one database with one table (in my dev environment), and the data doesn't change. I've been working on this for two days straight, always with the same results. – Ichlibitiche Jun 25 '19 at 13:19
  • Can you check these parameters - 'numFiles' and 'numRows' by running "SHOW CREATE TABLE mydata" in Hive? You can try to compare these values with the number of files present in HDFS for this table. – Gomz Jun 27 '19 at 11:59
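
The check suggested in the last comment can be sketched as follows. This sketch uses DESCRIBE FORMATTED and SHOW TBLPROPERTIES rather than the command named in the comment, since both list table parameters such as numFiles and numRows. The HDFS path is not given in the question, so it is left as a placeholder to be taken from the Location field of the output.

```sql
-- Table-level metadata, including Location and the numFiles / numRows parameters.
DESCRIBE FORMATTED mydata;

-- The same statistics as plain key/value table properties.
SHOW TBLPROPERTIES mydata;

-- To compare with the files actually on HDFS, list the table directory
-- (placeholder path, take it from the Location field above):
-- dfs -ls <table location>;
```

A numRows or numFiles value that disagrees with what is actually stored would point to outdated statistics.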
