
This question is a possible duplicate of this one, but the answers given are not satisfactory.

I've run the following simple code on Zeppelin (same scenario with the pyspark CLI as well):

%spark2.pyspark
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)

df = sqlContext.read.format("csv").option("header", "false").option("mode", "DROPMALFORMED").load("/data/data1.csv")
df.write.mode('overwrite').saveAsTable("default.hive_spark")

Then:

%spark2.pyspark
sqlDF = spark.sql("show tables")
sqlDF.show()

It shows:

+--------+----------------+-----------+
|database|       tableName|isTemporary|
+--------+----------------+-----------+
| default|      hive_spark|      false|
+--------+----------------+-----------+

But when I log in to the Hive CLI (user: hive), this table does not show up:

0: jdbc:hive2://ip-xxx.eu-west-3.com> USE default;
0: jdbc:hive2://ip-xxx.eu-west-3.com> SHOW TABLES;

+-----------+
| tab_name  |
+-----------+
| hive_test |
+-----------+

I tried:

sqlContext.uncacheTable("default.hive_spark")

I am confused.

Mehdi LAMRANI

1 Answer


Since you are working with hive2 on HDP 3.x, Spark and Hive use separate metastore catalogs: saveAsTable writes into Spark's own catalog, which is why the table shows up in spark.sql("show tables") but not in the Hive CLI. Use the HiveWarehouseConnector to read and write Hive-managed tables from Spark.
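As a rough sketch, writing through the connector would look like the following (this assumes an HDP 3.x cluster with the HWC jar and the `pyspark_llap` package available to the Spark session; the CSV path and table name are taken from the question):

```python
# Assumption: HDP 3.x with the HiveWarehouseConnector jar and the
# pyspark_llap package on the Spark classpath/PYTHONPATH.
from pyspark_llap.sql.session import HiveWarehouseSession

# Build an HWC session on top of the existing SparkSession `spark`.
hive = HiveWarehouseSession.session(spark).build()

df = spark.read.format("csv") \
    .option("header", "false") \
    .option("mode", "DROPMALFORMED") \
    .load("/data/data1.csv")

# Write through the connector so the table lands in the Hive catalog
# (visible to the Hive CLI), not in Spark's internal catalog.
df.write.format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR) \
    .option("table", "default.hive_spark") \
    .save()

# Verify from the Hive side through the same connector.
hive.execute("SHOW TABLES").show()
```

After this, `SHOW TABLES` from beeline should list hive_spark in the default database, because the write went through Hive's own catalog rather than Spark's.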

Bishamon Ten