
I have this simple query, which runs fine in Hive 0.8 on IBM BigInsights 2.0:

SELECT * FROM patient WHERE hr > 50 LIMIT 5

However, when I run this query using Hive 0.12 on BigInsights 3.0, it runs forever and returns no results. The same happens with the following query and many others:

INSERT OVERWRITE DIRECTORY '/Hospitals/dir' SELECT p.patient_id FROM
   patient1 p WHERE p.readingdate='2014-07-17'

If I leave out the WHERE clause, the queries run fine in both versions.

Any idea what might be wrong with Hive 0.12 or BigInsights 3.0 when the query includes a WHERE clause?

Henaras

1 Answer


When a Hive query includes a WHERE clause, Hive runs a MapReduce job to produce the results. That is why such a query usually takes longer: without the WHERE clause, Hive can simply return the contents of the files that back the table in HDFS.

Check the status of the MapReduce job triggered by your query to find out whether an error occurred. You can do that in the BigInsights web console (Application Status tab, then Jobs) or in the JobTracker web interface. If the job has any failed tasks, check the logs of those tasks to see what went wrong. After fixing the problem, run the query again.
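To confirm that the WHERE clause is what causes a MapReduce job to be launched, one option is to compare the query plans with Hive's EXPLAIN statement. This is only a sketch that reuses the table and column names from the question; the exact plan output differs between Hive versions, so treat the comments as a rough guide to what to look for.

```sql
-- Plan for the query without a filter. The expectation here is a plan
-- with only a fetch stage, meaning no MapReduce job is started.
EXPLAIN SELECT * FROM patient LIMIT 5;

-- Plan for the query with the filter. If the plan lists a Map Reduce
-- stage, running the query submits a job to the cluster, and that job
-- is the one to look for in the web console or the JobTracker.
EXPLAIN SELECT * FROM patient WHERE hr > 50 LIMIT 5;
```

EXPLAIN only prints the plan and does not execute the query, so it returns quickly even when the query itself hangs.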

Thomas