
I am trying to create Iceberg (0.11.1) formatted tables in Hive 3.1.1 using PySpark 3.0.2, but I am getting the errors and warnings below.

Any help would be greatly appreciated. Let me know if I should add any more details.

Code to create table:

spark.sql("""
    CREATE TABLE delta.global_spark_test (id bigint)
    USING iceberg
    TBLPROPERTIES (
        'transactional'='true',
        'ENGINE_HIVE_ENABLED'='true',
        'write.format.default'='orc'
    )
""")

Logs:

WARN hive.HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider ICEBERG. Persisting data source table `dfp`.`global_spark_test` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.

22/10/19 01:32:40 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=18ef49c2-e4ff-4cf7-aea6-c133daa97b3c, clientType=HIVECLI]

22/10/19 01:32:40 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.

22/10/19 01:32:40 INFO hive.metastore: Connected to metastore.

Traceback (most recent call last):

  File "/home/svc-dm-etl/datalake/bin/iceberg_config.py", line 48, in <module>

    spark.sql(" CREATE TABLE delta.global_spark_test(id bigint) USING ICEBERG tblproperties ( 'transactional'='true','ENGINE_HIVE_ENABLED'='true', 'write.format.default'='orc' ) ")

  File "/app/spark3.0.2/python/lib/pyspark.zip/pyspark/sql/session.py", line 649, in sql

  File "/app/spark3.0.2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__

  File "/app/spark3.0.2/python/lib/pyspark.zip/pyspark/sql/utils.py", line 134, in deco

  File "<string>", line 3, in raise_from

pyspark.sql.utils.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:The table must be stored using an ACID compliant format (such as ORC): delta.global_spark_test);

My Iceberg configurations are below:

### Iceberg configurations
.config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
#.config("spark.sql.catalog.hive_dev", "org.apache.iceberg.spark.SparkSessionCatalog")
.config("spark.sql.catalog.hive_dev", "org.apache.iceberg.spark.SparkCatalog")
.config("spark.sql.catalog.hive_dev.type", "hive")
.config("spark.sql.catalog.hive_dev.uri", "thrift:URL")
#.config("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
.config("spark.sql.catalog.spark_catalog.type", "hive")
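For completeness, the session is built roughly like this (the app name is illustrative, and `thrift:URL` is the same placeholder as above, not the real metastore URI):

```python
from pyspark.sql import SparkSession

# Sketch of the SparkSession builder the config lines above come from.
# "iceberg_test" is an illustrative app name; "thrift:URL" is a placeholder.
spark = (
    SparkSession.builder
    .appName("iceberg_test")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.hive_dev", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.hive_dev.type", "hive")
    .config("spark.sql.catalog.hive_dev.uri", "thrift:URL")
    .config("spark.sql.catalog.spark_catalog.type", "hive")
    .enableHiveSupport()
    .getOrCreate()
)
```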
