I want to build a streaming job in Apache Flink (Scala) that reads from Kafka and writes to Hive (Kafka -> Flink -> Hive). Can anyone provide a code sample? The official documentation is not very clear to me.

This should be a streaming process.

  • There's an example in SQL in the docs: https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/filesystem.html#full-example. – David Anderson Jul 09 '20 at 08:27
  • I need to do this in Scala code; this example is in SQL query form. Is it possible to do that in Scala? Currently, I am trying to write data directly to HDFS in ORC format, but Hive is not reading the data written by Flink, and I am not sure why. – patel akash Jul 09 '20 at 19:51

1 Answer

For help getting started with the Table API, Real Time Reporting with the Table API is a tutorial you can follow. It's in Java, but the Scala API isn't much different.

This is an example of using SQL to read from Kafka and write to Hive. To do the same from Scala you can wrap the SQL statements with tableEnv.executeSql(...), as in

tableEnv.executeSql("CREATE TABLE Orders (`user` BIGINT, product STRING, amount INT) WITH (...)")

or

val tableResult1 = tableEnv.executeSql("INSERT INTO ...")
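
Putting those pieces together, here is a minimal sketch of a Kafka -> Hive streaming job in Scala, assuming Flink 1.11 with the Hive, Kafka, and JSON connector dependencies on the classpath. The topic name, broker address, group id, catalog name `myhive`, Hive conf directory `/opt/hive-conf`, the `hive_orders` table, and the extra `ts` column are all placeholders I made up for illustration, so adjust them to your setup. Treat it as a starting point rather than a tested implementation.

```scala
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.{EnvironmentSettings, SqlDialect}
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment
import org.apache.flink.table.catalog.hive.HiveCatalog

object KafkaToHiveJob {

  // Kafka source, written in Flink's default SQL dialect.
  // Topic, brokers, group id, JSON format and the ts column are assumptions.
  val kafkaSourceDdl: String =
    """CREATE TABLE Orders (
      |  `user` BIGINT,
      |  product STRING,
      |  amount INT,
      |  ts TIMESTAMP(3)
      |) WITH (
      |  'connector' = 'kafka',
      |  'topic' = 'orders',
      |  'properties.bootstrap.servers' = 'localhost:9092',
      |  'properties.group.id' = 'flink-hive-demo',
      |  'scan.startup.mode' = 'earliest-offset',
      |  'format' = 'json'
      |)""".stripMargin

  // Hive sink, written in the Hive dialect. Partitions are committed to the
  // metastore so that Hive can see the data once files are finished.
  val hiveSinkDdl: String =
    """CREATE TABLE IF NOT EXISTS hive_orders (
      |  user_id BIGINT,
      |  product STRING,
      |  amount INT
      |) PARTITIONED BY (dt STRING) STORED AS orc TBLPROPERTIES (
      |  'sink.partition-commit.trigger' = 'process-time',
      |  'sink.partition-commit.policy.kind' = 'metastore,success-file'
      |)""".stripMargin

  // Continuous insert; the last selected column fills the dt partition.
  val insertSql: String =
    """INSERT INTO hive_orders
      |SELECT `user`, product, amount, DATE_FORMAT(ts, 'yyyy-MM-dd')
      |FROM default_catalog.default_database.Orders""".stripMargin

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // The streaming sink only finalizes files on checkpoints.
    env.enableCheckpointing(60000L)

    val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
    val tableEnv = StreamTableEnvironment.create(env, settings)

    // Register the Kafka table in the built-in in-memory catalog.
    tableEnv.executeSql(kafkaSourceDdl)

    // Hypothetical catalog name and hive-site.xml directory.
    val hive = new HiveCatalog("myhive", "default", "/opt/hive-conf")
    tableEnv.registerCatalog("myhive", hive)
    tableEnv.useCatalog("myhive")

    // Hive DDL needs the Hive dialect; switch back afterwards.
    tableEnv.getConfig.setSqlDialect(SqlDialect.HIVE)
    tableEnv.executeSql(hiveSinkDdl)
    tableEnv.getConfig.setSqlDialect(SqlDialect.DEFAULT)

    // Submits the streaming job; no env.execute() call is needed.
    tableEnv.executeSql(insertSql)
  }
}
```

Checkpointing is enabled because the sink writes ORC as a bulk format and only rolls and commits files when a checkpoint completes. Without it, Hive will not see finished files.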

If you need to do multiple inserts, then you'll need to do it a bit differently, using a StatementSet. See the docs linked to below for details.
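
As a rough illustration of the StatementSet approach, the sketch below groups two INSERT statements so they are submitted together as one job. It assumes Flink 1.11 and uses the built-in `datagen`, `print`, and `blackhole` connectors in place of Kafka and Hive so that it needs no external systems. The `AllOrders` and `LargeOrders` table names are made up for the example.

```scala
import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

object MultiInsertJob {

  // Both statements read from the same source table.
  val inserts: Seq[String] = Seq(
    "INSERT INTO AllOrders SELECT `user`, product, amount FROM Orders",
    "INSERT INTO LargeOrders SELECT `user`, product, amount FROM Orders WHERE amount > 100"
  )

  def main(args: Array[String]): Unit = {
    val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
    val tableEnv = TableEnvironment.create(settings)

    // Stand-in source that generates random rows (replace with the Kafka table).
    tableEnv.executeSql(
      "CREATE TABLE Orders (`user` BIGINT, product STRING, amount INT) " +
        "WITH ('connector' = 'datagen', 'rows-per-second' = '5')")

    // Stand-in sinks (replace with Hive tables).
    tableEnv.executeSql(
      "CREATE TABLE AllOrders (`user` BIGINT, product STRING, amount INT) " +
        "WITH ('connector' = 'print')")
    tableEnv.executeSql(
      "CREATE TABLE LargeOrders (`user` BIGINT, product STRING, amount INT) " +
        "WITH ('connector' = 'blackhole')")

    // Collect the inserts, then submit them together.
    val stmtSet = tableEnv.createStatementSet()
    inserts.foreach(sql => stmtSet.addInsertSql(sql))
    stmtSet.execute()
  }
}
```

Calling executeSql once per INSERT would start a separate job for each statement, which is why the inserts are collected first and submitted with a single execute().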

See Run a CREATE statement, Run an INSERT statement, Apache Kafka SQL Connector, and Writing to Hive.

If you get stuck, show us what you've tried and how it is failing.

David Anderson
  • Is it possible to write files directly in ORC format using the Streaming File Sink and then read them from a Hive table? https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html#orc-format I have written the code to write in ORC format, but Flink is making 103 KB files and writing correctly. – patel akash Jul 10 '20 at 01:07
  • PS: I need to read data from Kafka starting at a specific offset, perform some operations, and then write to Hive. Is that possible with what you suggested? – patel akash Jul 10 '20 at 01:14
  • You can find my code here https://stackoverflow.com/questions/62826092/i-want-to-write-orc-file-using-flinks-streaming-file-sink-but-it-doesn-t-t-writ – patel akash Jul 10 '20 at 01:36
  • https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#scan-startup-specific-offsets – David Anderson Jul 10 '20 at 08:08