
I would like to fiddle around a bit with Apache Flink and Apache Iceberg and test this on a local machine. I read through the documentation, but I'm still not sure what has to be set up locally to make this run. What I have so far is a docker-compose file that locally starts a Hadoop namenode and datanode and a Hive server, which stores its metadata in Postgres.

Additionally, I set up a local Flink project (a Java project with Scala 2.12) in my IDE, and besides the default Flink dependencies I added flink-clients, flink-table-api-java-bridge, flink-table-planner, flink-connector-hive, hive-exec, hadoop-client (version 2.8.3), flink-hadoop-compatibility and the iceberg-flink-runtime-1.14 dependency.

I'm then trying to create a simple catalog with a Flink SQL statement like this:

tEnv.executeSql(String.join("\n",
        "CREATE CATALOG iceberg_catalog WITH (",
        "  'type'='iceberg',",
        "  'catalog-type'='hive',",
        "  'uri'='thrift://localhost:9083',",
        "  'warehouse'='hdfs://namenode:8020/warehouse/path')"));

Afterwards I'm getting the following warnings and stack trace:

12:11:43,869 WARN  org.apache.flink.runtime.util.HadoopUtils                    [] - Could not find Hadoop configuration via any of the supported methods (Flink configuration, environment variables).
12:11:44,203 INFO  org.apache.hadoop.hive.conf.HiveConf                         [] - Found configuration file null
12:11:44,607 WARN  org.apache.hadoop.util.NativeCodeLoader                      [] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12:11:44,816 ERROR org.apache.hadoop.hive.metastore.utils.MetaStoreUtils        [] - Got exception: java.lang.ClassCastException class [Ljava.lang.Object; cannot be cast to class [Ljava.net.URI; ([Ljava.lang.Object; and [Ljava.net.URI; are in module java.base of loader 'bootstrap')
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.net.URI; ([Ljava.lang.Object; and [Ljava.net.URI; are in module java.base of loader 'bootstrap')
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.resolveUris(HiveMetaStoreClient.java:262) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182) [hive-exec-3.1.2.jar:3.1.2]

I read through the documentation, but I'm not sure what is necessary to run all of this locally from the IDE (and not inside a dedicated Flink cluster, with the dependencies added via the lib folder etc.).

It would be great if you could give me a hint about what I'm missing here or doing wrong.

Lothium
  • OK, this can be solved by using Java 8 instead of 11, but I'm still not sure how to connect to a Hive metastore, because the CREATE CATALOG command doesn't seem to create anything in the Hive metastore running in a Docker container. – Lothium Mar 16 '22 at 21:04

1 Answer


Note that the CATALOG represents the Iceberg tables' directory and is not itself part of Hive. Creating a catalog therefore does not leave anything in the Hive metastore.

But when you use Iceberg Flink SQL such as `CREATE DATABASE iceberg_db` to create a database in this Hive catalog, you'll see it in the Hive metastore as well.

In the same way, when you create a table using the Hive catalog and look at it with Hive's `DESC FORMATTED`, you'll find a table property named "table_type" with the value "ICEBERG".
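
For example (a minimal sketch; the database, table and column names are just placeholders), creating a database and a table through the Iceberg catalog makes both visible in the Hive metastore:

// Switch into the Iceberg catalog registered via CREATE CATALOG
tEnv.executeSql("USE CATALOG iceberg_catalog");

// The database created here also shows up in the Hive metastore
tEnv.executeSql("CREATE DATABASE IF NOT EXISTS iceberg_db");
tEnv.executeSql("USE iceberg_db");

// The table appears in Hive with the property 'table_type'='ICEBERG'
tEnv.executeSql(String.join("\n",
        "CREATE TABLE IF NOT EXISTS sample_table (",
        "  id BIGINT,",
        "  data STRING",
        ")"));

In Hive you can then run `DESC FORMATTED iceberg_db.sample_table` and find "table_type" set to "ICEBERG" among the table parameters.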

liliwei
  • Awesome, thanks for the explanation, that helped already! I was able to solve this problem, but now I get a `Connection refused` (and also `Failed to create file: hdfs://namenode:8020/user/hive/warehouse/flink_table/metadata/00000-bcec8744-7824-4fd9-b95b-cc6988a68921.metadata.json`) error. Do you have an example of how to set up Hive locally via Docker so that it is accessible from the local machine? If not, I'll create a new topic for this. – Lothium Mar 30 '22 at 15:04