
I followed this link to create a template that builds a Beam pipeline reading from KafkaIO, but I always get "incompatible types: org.apache.beam.sdk.options.ValueProvider cannot be converted to java.lang.String". The error is raised by the line ".withBootstrapServers(options.getKafkaServer())". The Beam version is 2.9.0, and here is part of my code.

public interface Options extends PipelineOptions {
    @Description("Kafka server")
    @Required
    ValueProvider<String> getKafkaServer();

    void setKafkaServer(ValueProvider<String> value);

    @Description("Topic to read from")
    @Required
    ValueProvider<String> getInputTopic();

    void setInputTopic(ValueProvider<String> value);

    @Description("Topic to write to")
    @Required
    ValueProvider<String> getOutputTopic();

    void setOutputTopic(ValueProvider<String> value);

    @Description("File path to write to")
    @Required
    ValueProvider<String> getOutput();

    void setOutput(ValueProvider<String> value);
}

public static void main(String[] args) {
    Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
    Pipeline p = Pipeline.create(options);

    PCollection<String> processedData = p.apply(KafkaIO.<Long, String>read()
            .withBootstrapServers(options.getKafkaServer())
            .withTopic(options.getInputTopic())
            .withKeyDeserializer(LongDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata()
    ) // ... rest of the pipeline omitted

And here is how I run the code:

mvn compile exec:java \
-Dexec.mainClass=${MyClass} \
-Pdataflow-runner -Dexec.args=" \
--project=${MyClass} \
--stagingLocation=gs://${MyBucket}/staging \
--tempLocation=gs://${MyBucket}/temp \
--templateLocation=gs://${MyBucket}/templates/${MyClass} \
--runner=DataflowRunner"
sheng666

2 Answers


To access the value held by a ValueProvider, you need to call its get() method, which returns the value with its concrete type.

For example, given the option:

ValueProvider<String> getKafkaServer();

you can access it with getKafkaServer().get(), which returns a String object.

It seems the KafkaIO API expects a plain String parameter rather than a ValueProvider, so you have to extract the value from the ValueProvider wrapper.
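
A minimal sketch of that change against the snippet from the question (the variable name kafkaRecords is just for illustration). Note that a runtime-provided ValueProvider only has a value at execution time, so calling get() while the template is being constructed may fail:

PCollection<KV<Long, String>> kafkaRecords = p.apply(KafkaIO.<Long, String>read()
        // unwrap the ValueProviders with get() so KafkaIO receives plain Strings
        .withBootstrapServers(options.getKafkaServer().get())
        .withTopic(options.getInputTopic().get())
        .withKeyDeserializer(LongDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .withoutMetadata());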

Brachi

I might have found the issue, which is that KafkaIO is not supported. The following is from Google's documentation on creating templates.

" Some I/O connectors contain methods that accept ValueProvider objects. To determine support for a specific connector and method, see the API reference documentation for the I/O connector. Supported methods have an overload with a ValueProvider. If a method does not have an overload, the method does not support runtime parameters. The following I/O connectors have at least partial ValueProvider support:

File-based IOs: TextIO, AvroIO, FileIO, TFRecordIO, XmlIO
BigQueryIO*
BigtableIO (requires SDK 2.3.0 or later)
PubSubIO
SpannerIO "

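For comparison, connectors on that list expose overloads that accept the ValueProvider itself, so the wrapper is only resolved when the templated job actually runs. A minimal sketch using TextIO and the getOutput() option from the question, assuming processedData holds the strings to write:

// TextIO.write().to(...) has a ValueProvider<String> overload, so the option can be
// passed through as-is instead of being unwrapped at template-construction time.
processedData.apply(TextIO.write().to(options.getOutput()));
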
sheng666
  • FYI, by using Flex Templates, one can use any kind of connector and avoid having to deal with ValueProviders. https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates – robertwb Oct 01 '21 at 23:57