
I am using spark structured streaming and reading from Kafka topic. The goal is to write the message to PostgreSQL database multiple tables.

The message schema is:

 root
  |-- id: string (nullable = true)
  |-- name: timestamp (nullable = true)
  |-- comment: string (nullable = true)
  |-- map_key_value: map (nullable = true)
  |    |-- key: string
  |    |-- value: string (valueContainsNull = true)

Writing to a single table (after dropping map_key_value) works with the code below:

message.writeStream.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    batchDF.write.format("jdbc")
      .option("url", "url")
      .option("user", "username")
      .option("password", "password")
      .option(JDBCOptions.JDBC_TABLE_NAME, "table_1")
      .mode(SaveMode.Append)
      .save()
  }.outputMode(OutputMode.Append()).start().awaitTermination()

I want to write the message to two database tables: table_1 (id, name, comment), while table_2 needs to have the map_key_value data.
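One way to shape the data for the two tables is to split each micro-batch into two DataFrames, exploding the map column into key/value rows for table_2. A minimal sketch, assuming a spark-shell session (so `spark.implicits._` is in scope); the column and table layouts are taken from the schema above, and the sample values are made up:

```scala
import org.apache.spark.sql.functions.{col, explode}

// Stand-in for one micro-batch with the schema from the question
// (name is shown as a string here just for brevity).
val message = Seq(
  ("1", "2024-01-01 00:00:00", "hello", Map("k1" -> "v1", "k2" -> "v2"))
).toDF("id", "name", "comment", "map_key_value")

// table_1 keeps the scalar columns.
val table1DF = message.select(col("id"), col("name"), col("comment"))

// table_2 gets one row per map entry; explode on a map column
// yields two columns named key and value.
val table2DF = message.select(col("id"), explode(col("map_key_value")))
```

Each of these DataFrames can then be written to its own JDBC table.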

Jacek Laskowski
JDev

1 Answer

You will need N streaming queries for N sinks; table_1 and table_2 each count as a separate sink.

writeStream does not currently support a jdbc sink, so you should use the foreachBatch operator.
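A sketch of the foreachBatch approach, writing each micro-batch to both tables within a single streaming query; the url, credentials, and table names are placeholders, and the select/explode split is one assumed way to map the schema onto the two tables:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.streaming.OutputMode

message.writeStream.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    // Cache the batch so it is not recomputed for the second write.
    batchDF.persist()

    // table_1: scalar columns only.
    batchDF.select(col("id"), col("name"), col("comment"))
      .write.format("jdbc")
      .option("url", "url")
      .option("user", "username")
      .option("password", "password")
      .option("dbtable", "table_1")
      .mode(SaveMode.Append)
      .save()

    // table_2: one row per map entry (id, key, value).
    batchDF.select(col("id"), explode(col("map_key_value")))
      .write.format("jdbc")
      .option("url", "url")
      .option("user", "username")
      .option("password", "password")
      .option("dbtable", "table_2")
      .mode(SaveMode.Append)
      .save()

    batchDF.unpersist()
  }.outputMode(OutputMode.Append()).start().awaitTermination()
```

The persist/unpersist pair is there because the two `.save()` calls would otherwise each re-evaluate the batch.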

thebluephantom