
I want to use Apache Flink on a secure, kerberized HDP 3.1 cluster, but I am still stuck on the first steps.

I downloaded and unzipped the latest release (https://flink.apache.org/downloads.html#apache-flink-1101).

Now I am trying to follow https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/hive/, which states:

To integrate with Hive, you need to add some extra dependencies to the /lib/ directory in Flink distribution to make the integration work in Table API program or SQL in SQL Client. Alternatively, you can put these dependencies in a dedicated folder, and add them to classpath with the -C or -l option.

Due to the HDP environment, how can I:

  1. tell Flink to load these dependencies onto the classpath?
  2. is it somehow possible to omit the first configuration steps (name, default database, conf dir, version) and infer them automatically from hive-site.xml? (see the sketch below this list)
  3. start an interactive shell (similar to a spark-shell, i.e. like Flink's interactive SQL shell but Scala-based) in order to follow along with the steps outlined in the link?
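
Regarding 2.: hive-site.xml only contains Hive properties; as far as I can tell, the catalog name and the Hive version are not stored there, so at best some of the values could be read from it manually. A rough, purely illustrative sketch (assuming hadoop-common is on the classpath; the property names are the standard Hive ones):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

val hiveConfDir = "/usr/hdp/current/hive-server2/conf"
val conf = new Configuration(false)
conf.addResource(new Path(s"$hiveConfDir/hive-site.xml"))

// e.g. the metastore URI and the default warehouse location
val metastoreUris = conf.get("hive.metastore.uris")
val warehouseDir  = conf.get("hive.metastore.warehouse.dir")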

The code from the linked documentation:

// imports needed for the snippet below
import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}
import org.apache.flink.table.catalog.hive.HiveCatalog

val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build()
val tableEnv = TableEnvironment.create(settings)

val name            = "myhive"
val defaultDatabase = "mydatabase"
val hiveConfDir     = "/opt/hive-conf" // a local path
val version         = "2.3.4"

val hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version)
tableEnv.registerCatalog("myhive", hive)

// set the HiveCatalog as the current catalog of the session
tableEnv.useCatalog("myhive")
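
For context, once the catalog registration above succeeds, a minimal follow-up could look like the sketch below (mytable is a hypothetical table name, and this assumes the Hive dependencies are actually on the classpath):

// inspect what the Hive catalog exposes
tableEnv.listDatabases()
tableEnv.listTables()

// query an existing Hive table via SQL (mytable is a hypothetical name)
val result = tableEnv.sqlQuery("SELECT * FROM mytable")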

edit

Presumably, the following is needed to find the Hadoop configuration:

export HADOOP_CONF_DIR=/usr/hdp/current/spark2-client/conf

As well as, per https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/hadoop.html:

export HADOOP_CLASSPATH=$(hadoop classpath)

For now, I still fail to start a Flink shell, even without any Hive support:

cd /path/to/flink-1.10.1/bin
./start-scala-shell.sh
Error: Could not find or load main class org.apache.flink.api.scala.FlinkShell

This preliminary problem seems to be fixable by switching to the Scala 2.11 build of Flink (see: Flink 1.7.2 start-scala-shell.sh cannot find or load main class org.apache.flink.api.scala.FlinkShell).

./start-scala-shell.sh local

already works for me to start a local shell.

./start-scala-shell.sh yarn

starts something (locally), but no YARN container is launched.

Meanwhile I have set:

catalogs:
   - name: myhive
     type: hive
     hive-conf-dir: /usr/hdp/current/hive-server2/conf
     hive-version: 3.1.2

in the local Flink configuration (presumably conf/sql-client-defaults.yaml). It is still unclear to me whether simply setting the environment variables mentioned above should make this work automatically.

However, for me the code does not compile, as env is not defined:

scala> env
<console>:68: error: not found: value env

but trying to manually specify

import org.apache.flink.api.scala._
import org.apache.flink.table.api._
import org.apache.flink.table.api.scala._

// environment configuration
val env = ExecutionEnvironment.getExecutionEnvironment
val tEnv = BatchTableEnvironment.create(env)

fails as well with

.UnsupportedOperationException: Execution Environment is already defined for this shell.
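
Apparently the Scala shell already pre-binds its environments, which is why creating new ones fails. A minimal sketch of using the pre-bound ones instead (assuming the Flink 1.10 Scala shell, which as far as I understand exposes benv/senv for batch/streaming and btenv/stenv for the corresponding table environments):

// use the environments the shell already defines instead of creating new ones
benv.getParallelism     // pre-bound batch ExecutionEnvironment
btenv.listCatalogs()    // pre-bound batch table environment
btenv.listDatabases()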

edit 2

With:

cd /path/to/flink-1.10.1
export HADOOP_CONF_DIR=/usr/hdp/current/spark2-client/conf
export HADOOP_CLASSPATH=$(hadoop classpath)

./bin/yarn-session.sh --queue my_queue -n 1 -jm 768 -tm 1024

I can successfully start a minimalistic Flink cluster on YARN (without the Ambari service), though it would make sense to install the Ambari integration.

For now, I could not yet test whether / how the interaction with the kerberized Hive and HDFS works. Also, I still fail to start an interactive shell, as outlined below.

In fact, even in a playground non-kerberized environment I observe issues with Flink's interactive shell (see: flink start scala shell - numberformat exception).

edit 3

I do not know what changed, but with:

cd /home/at/heilerg/development/software/flink-1.10.1
export HADOOP_CONF_DIR=/usr/hdp/current/spark2-client/conf
export HADOOP_CLASSPATH=$(hadoop classpath)

./bin/start-scala-shell.sh local

btenv.listDatabases
//res12: Array[String] = Array(default_database)

btenv.listTables
//res9: Array[String] = Array()

I can get hold of a batch table environment in local mode. Currently, though, no tables or databases from Hive are present.

NOTE: the configuration is set up as follows:

catalogs:
   - name: hdphive
     type: hive
     hive-conf-dir: /usr/hdp/current/hive-server2/conf
     hive-version: 3.1.2
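
To check whether this catalog definition actually reaches the shell session, the registered catalogs can be listed from the pre-bound btenv (a quick check only; as far as I can tell, the catalogs section is read by the SQL Client from conf/sql-client-defaults.yaml and may not be picked up by the Scala shell at all):

btenv.listCatalogs()
// if "hdphive" shows up, switch to it and inspect it
// btenv.useCatalog("hdphive")
// btenv.listDatabases()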

When instead trying to use code over configuration, I cannot import the HiveCatalog:

val name            = "hdphive"
val defaultDatabase = "default"
val hiveConfDir     = "/usr/hdp/current/hive-server2/conf" // a local path
val version         = "3.1.2" //"2.3.4"


import org.apache.flink.table.catalog.hive.HiveCatalog
// for now, I am failing here
// <console>:67: error: object hive is not a member of package org.apache.flink.table.catalog
//       import org.apache.flink.table.catalog.hive.HiveCatalog

val hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version)
tableEnv.registerCatalog(name, hive)
tableEnv.useCatalog(name)

btenv.listDatabases

These jars were manually put into the lib directory, but regardless of the version of Hive's jars I face missing Hive classes:

val version         = "3.1.2" // or "3.1.0" // has the same problem
import org.apache.flink.table.catalog.hive.HiveCatalog
val hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version)
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
  ... 30 elided
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
  ... 30 more

But isn't export HADOOP_CLASSPATH=$(hadoop classpath) supposed to load the HDP classes?

Anyway:

cp /usr/hdp/current/hive-client/lib/hive-exec-3.1.0.<<<version>>>.jar /path/to/flink-1.10.1/lib

gets me one step further:

val hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version)
btenv.registerCatalog(name, hive)
Caused by: java.lang.ClassNotFoundException: com.facebook.fb303.FacebookService$Iface

After adding the libfb303 jar from https://repo1.maven.org/maven2/org/apache/thrift/libfb303/0.9.3/ to the lib directory,

btenv.registerCatalog(name, hive)

no longer fails with a ClassNotFoundException, but execution seems to be stuck at this step for several minutes. Then it fails with a Kerberos exception:

Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed

I just realized that:

<property>
      <name>hive.metastore.kerberos.principal</name>
      <value>hive/_HOST@xxxx</value>
</property>

and

klist
Default principal: user@xxxx

So the principal here does not match the one from hive-site.xml. However, Spark can read the metastore just fine with the same configuration and the same principal mismatch.
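
For reference, Flink's own Kerberos credentials are configured in conf/flink-conf.yaml; a minimal sketch of the relevant keys (the keytab path and principal below are placeholders, and I have not yet verified that this alone fixes the GSS error for the Hive metastore):

security.kerberos.login.use-ticket-cache: true
# alternatively, a keytab (path and principal are placeholders):
security.kerberos.login.keytab: /path/to/user.keytab
security.kerberos.login.principal: user@xxxx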

  • Georg, this is an advanced Flink scenario just because of the kerberization and integration with non HDP service. If you have HDP Support, this is right in line with a support ticket to get at contributor level resources. If you do not have support, and you are trying to DIY, you may need to go directly at Flink contributor channels, jira, or asf slack. I am not even sure you can get this kind of support in the Cloudera Community and many of the big guys are here too. Our DFHZ teams are currently working a HDP3 Flink for ambari, but unfortunately the kerberized use case is last on list. – steven-matison Jun 12 '20 at 22:12
  • I have linked this here to the mailing list: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Running-Flink-on-kerberized-HDP-3-1-minimal-getting-started-td35925.html – Georg Heiler Jun 13 '20 at 07:57
  • Excellent! I was just successful getting Flink 1.10 installed in HDP3 on centos7. When this is done a Flink YARN app is created with the jar file locations in environment variables. It's a huge string of paths and jars which I can't put here in a comment. I think this is the answer to your Question 1. – steven-matison Jun 13 '20 at 14:52
  • Maybe you could share a gist or post to the flink mailing list – Georg Heiler Jun 13 '20 at 14:53
  • https://gist.github.com/steven-dfheinz/6617c91837ceef543f8e8fe1f370ab9b – steven-matison Jun 13 '20 at 15:08
  • How did you construct this long string of JARs? – Georg Heiler Jun 13 '20 at 16:38
  • Ambari did it... my demo Flink is installed as a service, single node test cluster. – steven-matison Jun 13 '20 at 16:51
  • Which script were you using to create the ambari service? https://github.com/abajwa-hw/ambari-flink-service? – Georg Heiler Jun 14 '20 at 05:32
  • https://community.cloudera.com/t5/Community-Articles/Exploring-Apache-Flink-with-HDP/ta-p/244609 has some introductory comments with regards to the link I just shared. – Georg Heiler Jun 14 '20 at 07:41
  • @steven-dfheinz, please see the edit above. I can get a minimalistic cluster to run, i.e. also in kerberized YARN. This is already great. However, a couple of questions are still open, namely 1) how to use an interactive shell like spark-shell (see above with the exceptions where I currently fail) and 2) how to get interaction with HDFS and Hive working. I would expect some jaas.conf fiddling around here. – Georg Heiler Jun 14 '20 at 07:51
  • Yes that is the service for HDP 2.6.5. I just had one minor issue with it in HDP3... https://github.com/abajwa-hw/ambari-flink-service/issues/19 – steven-matison Jun 15 '20 at 14:48
  • @steven-dfheinz can you view the tables already defined in Hive? For me, only an empty list of databases is returned, i.e. the existing Hive metastore configuration is not loaded. – Georg Heiler Jun 17 '20 at 08:52
  • @steven-dfheinz did you figure out kerberos and flink with Hive interoperability? I am stuck now on a GSS exception. – Georg Heiler Jun 17 '20 at 13:52
  • No, back to my original comments, kerberized is super advanced, and would be last on list of things to do.... I am moving at high level just to satisfy reducing admin issues/errors/bugs in just getting a base install of custom services operational in the stack. As you know Flink isn't "supposed" to be in the HDP Stack...I specialize in building custom stacks.... – steven-matison Jun 17 '20 at 14:29
  • I think most likely https://medium.com/@minyodev/apache-flink-on-yarn-with-kerberos-authentication-adeb62ef47d2 is still missing in my setup. Also there is a conflict between Flink's shaded Hadoop JARs and Flink itself (at least for me). – Georg Heiler Jun 17 '20 at 15:17
  • Nice updates, I have been busy deploying a use case from S3 to Flink... finally got things working nicely. Everything is a trial/error process with dependencies. Much more work to do. – steven-matison Jun 18 '20 at 23:59
