
When we use the DataStreamReader API in Spark, we configure the chosen format through the option/options methods. For example, in the code below I'm using Kafka as the source and passing its configuration through the option method. I set only two options here: the server details and the topic subscription. What I'm trying to find out is which other options are available for a data source or sink of a particular format, in this case Kafka. I was able to find a few options listed in the Kafka guide in the Spark documentation, but where can I find the rest of the options for the Kafka format? I searched the Spark documentation for this information but had no luck.

Is there a reference for the options available for each data source/sink format in Spark (especially for Structured Streaming)?

spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1")
  .option("subscribe", "topic1")
  .load()
Scarface

1 Answer


You can check the Input Sources and Output Sinks sections of the official Apache Spark documentation.

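For Kafka, you can pass Apache Kafka's own consumer configurations by prefixing them with "kafka.", as the Structured Streaming Kafka Integration guide explains. A few of them, such as the key and value deserializers, are set by Spark itself and cannot be overridden.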
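As a rough sketch of how the two kinds of options sit side by side, the reader below combines source options from the Kafka integration guide (startingOffsets, maxOffsetsPerTrigger) with one plain Kafka consumer setting passed through the "kafka." prefix (session.timeout.ms). The object name, app name, and the option values are illustrative assumptions, not recommendations, and the spark-sql-kafka-0-10 package is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, only here to make the sketch self-contained.
object KafkaOptionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-options-sketch") // illustrative name
      .getOrCreate()

    val df = spark.readStream
      .format("kafka")
      // The two options from the question.
      .option("kafka.bootstrap.servers", "host1:port1")
      .option("subscribe", "topic1")
      // Source options described in the Kafka integration guide.
      .option("startingOffsets", "earliest")
      .option("maxOffsetsPerTrigger", "10000") // assumed value
      // A Kafka consumer configuration, forwarded via the "kafka." prefix.
      .option("kafka.session.timeout.ms", "30000") // assumed value
      .load()

    // Shows the columns the Kafka source exposes (key, value, topic, ...).
    df.printSchema()
  }
}
```

Options without the prefix are interpreted by the Spark Kafka source, while prefixed ones are handed to the underlying Kafka consumer, so the two reference pages to consult are the Spark integration guide and the Kafka consumer configuration list.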

Zied Yazidi