
I am trying to read from a Kafka topic and write the same records to another Kafka topic using KafkaSource/KafkaSink in PyFlink (Flink version 1.16). Reading from the Kafka topic works and I am able to print the result, but when I try to send to Kafka using KafkaSink I get the following exception:

NOTE: Picked up JDK_JAVA_OPTIONS: --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.util.concurrent.atomic=ALL-UNNAMED    
Traceback (most recent call last):
      File "/home/.../PycharmProjects/reddit-anomaly-detection-job/main.py", line 75, in <module>
        main()
      File "/home/.../PycharmProjects/reddit-anomaly-detection-job/main.py", line 49, in main
        kafka_producer = KafkaSink.builder() \
      File "/home/.../.conda/envs/reddit-anomaly-detection-job/lib/python3.9/site-packages/pyflink/datastream/connectors/kafka.py", line 963, in set_record_serializer
        get_field_value(j_topic_selector, 'topicSelector').getClass().getCanonicalName()
    AttributeError: 'NoneType' object has no attribute 'startswith'

The code is:

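from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream.connectors.kafka import KafkaRecordSerializationSchema, KafkaSink

# kafka_sink_topic and bootstrap_servers are defined earlier in main()

# Serialize each record value as a plain string for the target topic
record_serializer = KafkaRecordSerializationSchema.builder() \
    .set_topic(kafka_sink_topic) \
    .set_value_serialization_schema(SimpleStringSchema()) \
    .build()

# Create the Kafka sink (producer); the exception is raised in set_record_serializer
kafka_producer = KafkaSink.builder() \
    .set_bootstrap_servers(bootstrap_servers) \
    .set_record_serializer(record_serializer) \
    .build()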

UPDATE: The problem seems to come from my local environment. The same code runs on Ververica on top of a custom Python image. I tried to follow this article, adapting it to Kafka, but it does not work locally in PyCharm.
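Since the same code behaves differently in the two environments, it may help to compare them directly. The following is a minimal sketch using only the Python standard library. The helper name `describe_env` is hypothetical, and which settings actually matter is an assumption. The `JDK_JAVA_OPTIONS` line with `--add-opens` in the output above hints that the local JVM could be a newer JDK than the Java 8/11 that Flink 1.16 documents as supported, but that is not confirmed here.

```python
import os
import platform
import shutil
import subprocess
from importlib import metadata


def describe_env():
    """Collect settings that may differ between two PyFlink setups."""
    info = {
        "python": platform.python_version(),
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        "JDK_JAVA_OPTIONS": os.environ.get("JDK_JAVA_OPTIONS"),
    }

    # PyFlink is published on PyPI as "apache-flink"
    try:
        info["apache-flink"] = metadata.version("apache-flink")
    except metadata.PackageNotFoundError:
        info["apache-flink"] = None

    # Look up the `java` executable on PATH, if there is one
    java = shutil.which("java")
    info["java_path"] = java
    if java is None:
        info["java_version"] = None
    else:
        # `java -version` normally writes its report to stderr
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        info["java_version"] = (result.stderr or result.stdout).strip()
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

Running this both in the PyCharm interpreter and inside the custom Python image would show whether the Python, Java, or apache-flink versions differ. Note that the `java` found on `PATH` is not necessarily the JVM PyFlink starts, so the reported `JAVA_HOME` is worth checking as well.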

Monika X