Apache Beam's KafkaIO supports Kafka consumers that read only from specified partitions. I have the following code:
KafkaIO.<String, String>read()
    .withCreateTime(Duration.standardMinutes(1))
    .withReadCommitted()
    .withBootstrapServers(endPoint)
    .withConsumerConfigUpdates(new ImmutableMap.Builder<String, Object>()
        .put(ConsumerConfig.GROUP_ID_CONFIG, groupName)
        .put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5)
        .put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")
        .build())
    .commitOffsetsInFinalize()
    .withTopicPartitions(topicPartitions) // placeholder for a List<TopicPartition>
I have two questions:
- How do I get the partition identifiers from Kafka, and how do I specify them in KafkaIO?
- Does Apache Beam spawn as many Kafka consumers as there are partitions in the list supplied when the Kafka reader is created?