I am creating a Kafka-based Flink streaming application, and am trying to create an associated KafkaSource connector in order to read Kafka data.

For example:

final KafkaSource<String> source = KafkaSource.<String>builder()
     // standard source builder setters
     // ...
     .setProperty(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "truststore.jks")
     .build();

The truststore.jks file is created locally on the job manager node before the application is executed, and I've verified that it exists and is correctly populated. My problem is that, in a distributed Flink application, truststore.jks does not automatically exist on the task worker nodes as well, so the above code results in a FileNotFoundException when executed.

What I've tried:

  • Use env.registerCacheFile and getRuntimeContext().getDistributedCache().getFile() in order to distribute the file to all nodes, but since the graph is being built and the application is not yet running, the RuntimeContext is not available at this stage.
  • Supply a base64 parameter representation of the truststore, and manually convert it to .jks format. I'd need some sort of "pre-initialization" KafkaSource hook to do this, and haven't found any such functionality in the docs.
  • Use an external data store, such as S3, and retrieve the file from there. As far as I can tell, the internal Kafka consumer does not support non-local filesystems, so I'd still need some pre-initialization way to download the file to local disk on each task node.
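
For reference, the conversion step in the second bullet is simple on its own. A minimal sketch, assuming the truststore arrives as a base64 string (the class name, method name, and temp-file naming are hypothetical, chosen only for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: decodes a base64-encoded truststore and writes it
// to a local temp file that a Kafka consumer could then be pointed at.
final class TruststoreMaterializer {

    static Path writeToTempFile(String base64Truststore) throws IOException {
        // Recover the raw bytes of the original .jks file.
        byte[] bytes = Base64.getDecoder().decode(base64Truststore);
        // Create a uniquely named file in the default temp directory.
        Path file = Files.createTempFile("truststore", ".jks");
        Files.write(file, bytes);
        // Best-effort cleanup when the JVM exits.
        file.toFile().deleteOnExit();
        return file;
    }
}
```

The missing piece is where to call something like this on each task node before the Kafka consumer is created.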

What is the best way to make this file available to task worker nodes during the source initialization?

I have read similar questions posted here before:

  1. how to distribute files to worker nodes in apache flink
     • As explained above, I don't have access to the RuntimeContext at this point in the application.
  2. Flink Kafka Connector SSL Support
     • This injects the truststore as a base64-encoded string parameter. I could do this, but since the internal Kafka consumer expects a file, I would have the problem of converting the parameter to .jks format before consumer initialization. I don't see a way of registering a "pre-initialization" hook for the KafkaSource in the docs.
pika

1 Answer

Update:

I was able to work around this issue by using the ssl.truststore.certificates configuration field instead. This allows me to supply the certificates from truststore.jks inline, in PEM (base64-encoded) form, instead of a local file path. Note that Kafka only reads this field when ssl.truststore.type is set to PEM.

I also had to update my kafka-clients dependency to 2.7.0 or later, as this configuration is not available in older versions of the library.
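
A minimal sketch of the resulting configuration, using plain string keys so it does not depend on a particular kafka-clients version at compile time. The class and method names are hypothetical, and security.protocol=SSL is an assumption (a SASL_SSL setup would use that value instead):

```java
import java.util.Properties;

// Hypothetical helper (not from the original post): builds the consumer
// properties for a truststore passed inline as PEM text instead of a file path.
final class PemTruststoreProps {

    static Properties build(String pemCertificates) {
        Properties props = new Properties();
        // Assumption: plain TLS without SASL.
        props.setProperty("security.protocol", "SSL");
        // Inline certificates are only accepted when the store type is PEM.
        props.setProperty("ssl.truststore.type", "PEM");
        // Full PEM text, including the BEGIN/END CERTIFICATE lines.
        props.setProperty("ssl.truststore.certificates", pemCertificates);
        return props;
    }
}
```

The result can be passed to the source with KafkaSource.<String>builder().setProperties(PemTruststoreProps.build(pem)) alongside the standard builder setters. Because the certificates travel as part of the serialized source configuration, nothing has to exist on the task nodes' local disks.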

pika