
I'm trying out Pivotal HAWQ with Ambari, and I'm now trying to run some queries over Hive tables with HAWQ.

From what I have seen, HAWQ can query Hive tables through HCatalog (https://community.hortonworks.com/articles/43264/hawqhdb-and-hadoop-with-hive-and-hbase.html), so I use the psql tool on the command line to run queries like this:

SELECT * FROM hcatalog.hive-db-name.hive-table-name;

I previously ran some queries on Hive to compare the results with HAWQ. I was expecting HAWQ to be much faster, but it is much slower: the query response time is much longer than in Hive.

Can someone explain why this is happening?

  • create table hawq_table as SELECT * FROM hcatalog.hive-db-name.hive-table-name distributed randomly; Now execute your query against hawq_table. PXF has to go through Hive to get the data, so Hive is still the bottleneck. After you create the table in HAWQ, you will bypass slow Hive and things will be much faster. – Jon Roberts Jul 07 '17 at 12:13
  • @JonRoberts thanks for your answer. I was able to create the tables and query them. Previously I was able to build the tables as you said in pgsql, but now I get the following error: remote component error java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/orc/OrcInputFormat (libchurl.c:897) – Mário Rodrigues Jul 12 '17 at 15:41
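
The approach suggested in the first comment could be sketched roughly as follows. This is only an illustration, not a tested recipe: the names hive_db, hive_table and hawq_copy are hypothetical placeholders (underscores are used because unquoted identifiers cannot contain hyphens), and it assumes the HCatalog integration is already working from psql.

```sql
-- Hypothetical names: hive_db, hive_table, hawq_copy.
-- 1. Copy the Hive data into a native HAWQ table. The data is read
--    through PXF/Hive once, at copy time.
CREATE TABLE hawq_copy AS
SELECT * FROM hcatalog.hive_db.hive_table
DISTRIBUTED RANDOMLY;

-- 2. Collect planner statistics on the new table.
ANALYZE hawq_copy;

-- 3. Run the comparison query against the native table instead of
--    the hcatalog path.
SELECT count(*) FROM hawq_copy;
```

Running `\timing on` in psql before step 3 prints the elapsed time of each statement, which makes the comparison with the Hive timings easier.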

0 Answers