
I'm trying to use Snappydata 1.0.1 to read and process data from Hadoop (HDP 2.6.3).

When pointing Spark from the SnappyData distribution to the Hive metastore (via hive-site.xml in the SnappyData config), it can read the list of databases but cannot create a table in SnappyData: CREATE TABLE reports 'Table not found'. Moreover, the SnappyData cluster UI shows the table, but SnappyData cannot work with it further: INSERT, SELECT, and DROP commands on the table throw a 'table not found' error, while a subsequent CREATE TABLE reports 'Table already exists'.

Without specifying a Hive metastore, everything works well.

Configuration in hive-site.xml:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://srv1.company.org:9083</value>
</property>

Also, we are using Smart Connector mode.

It seems very strange: pointing to the Hive metastore breaks SnappyData code that is completely unrelated to external Hive (we are not yet reading from or writing to Hadoop).

Our hypothesis is that SnappyData is incompatible with our Hive metastore version, and that this incompatibility causes the strange behavior. Can someone clarify this issue?

Valentin P.
  • +1 I have the same issue; it may be a SnappyStoreHiveCatalog bug in Smart Connector mode. [https://github.com/SnappyDataInc/snappydata/issues/1072](https://github.com/SnappyDataInc/snappydata/issues/1072) – user5316398 Jul 08 '18 at 05:00
  • user5316398, I've just added solution – Valentin P. Jul 10 '18 at 09:13

1 Answer


It seems that to read data from Hadoop (Hive, HDFS), we have to create exactly the same table as an external table in SnappyData. That is, given table A in Hadoop, we have to create a table with the same definition and the EXTERNAL keyword in SnappyData in order to read data from Hadoop table A. This can be explained by SnappyData maintaining its own metadata store.
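The workaround above can be sketched in SQL. This is a hypothetical example, not taken from my actual setup: the table name, columns, data source format, and HDFS path are placeholder assumptions, and the exact OPTIONS depend on how the Hadoop table is stored (see the SnappyData CREATE EXTERNAL TABLE documentation for your version):

```sql
-- Suppose Hadoop has a table A stored as Parquet files in HDFS.
-- Register a matching external table in SnappyData so its own catalog
-- knows about it (path below is a placeholder):
CREATE EXTERNAL TABLE A
USING parquet
OPTIONS (path 'hdfs://<namenode>:8020/apps/hive/warehouse/a');

-- After that, queries against A work through SnappyData as usual:
SELECT * FROM A LIMIT 10;
```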

However, this is not clear from the docs. And it is a pity that no one answered this question for almost two weeks.

Valentin P.