I am trying to create a Flink consumer for a Kafka topic carrying Avro-serialized data. The topic is streaming Avro-serialized records, and I can see them via the kafka-avro-console-consumer.

Flink 1.6.0 added an AvroDeserializationSchema, but I cannot find a complete example of its usage. There are a few examples that hand-roll their own Avro deserialization class, but those seem to predate 1.6.0 adding the class.

I have an Avro class generated via avro-tools.

Right now I have been trying to follow the examples that exist, but they are different enough that I can't get things going. (I don't program in Java that often.)

Most use some form of the following:

Myclass mc = new Myclass();
AvroDeserializationSchema<Myclass> ads = new AvroDeserializationSchema<>(Myclass.class);
FlinkKafkaConsumer010<Myclass> kc = new FlinkKafkaConsumer010<>(topic, ads, properties);

where Myclass is an Avro class generated via the avro-tools jar. Is this the correct way to go? I am running into private/public access issues when doing this with the AvroDeserializationSchema class that ships with Flink 1.6.0. Do I have to create a new class that extends AvroDeserializationSchema?
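
Looking at the 1.6.0 Javadocs, AvroDeserializationSchema appears to be created via static factory methods rather than a public constructor, which would explain the access errors. A minimal sketch of what I believe the intended usage is (untested; Myclass again stands in for the avro-tools generated class):

// sketch only: forSpecific() is the factory for avro-tools generated
// (SpecificRecord) classes; forGeneric(schema) exists for GenericRecord
AvroDeserializationSchema<Myclass> ads = AvroDeserializationSchema.forSpecific(Myclass.class);
FlinkKafkaConsumer010<Myclass> kc = new FlinkKafkaConsumer010<>(topic, ads, properties);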

  • I think this question is not Flink related. You should look for the answer in the kafka-jdbc-connector project documentation. – Dawid Wysakowicz Aug 22 '18 at 07:32
  • I edited the question as I have narrowed it down to Avro deserialization via the Flink AvroDeserializationSchema and FlinkKafkaConsumer classes. – Chris P Aug 26 '18 at 15:45

1 Answer


OK, I dug into the Kafka consumer Javadocs and found an example that consumes the Avro stream. I still have to convert the Kafka consumption to a FlinkKafkaConsumer, but the code below works.

For the io.confluent references to resolve, I had to add a repository and a dependency to the POM file.

<repository>
    <id>confluent</id>
    <url>http://packages.confluent.io/maven/</url>
</repository>

<dependency>
    <groupId>io.confluent</groupId>
    <artifactId>kafka-avro-serializer</artifactId>
    <version>3.1.1</version>
</dependency>

import java.util.Arrays;
import java.util.Properties;

import org.apache.avro.generic.GenericRecord;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class StreamingJob {

//  static  DeserializationSchema<pendingsv> avroSchema = new AvroDeserializationSchema<pendingsv>(pendingsv.class);
    public static void main(String[] args) throws Exception {
        // set up the streaming execution environment
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // plain Kafka consumer configured to deserialize Avro values
        // through the Confluent schema registry
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "opssupport.alarms");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        String topic = "pendingSVs_";
        final Consumer<String, GenericRecord> consumer = new KafkaConsumer<String, GenericRecord>(props);
        consumer.subscribe(Arrays.asList(topic));

        try {
            // poll the topic and print each record's offset, key, and value
            while (true) {
                ConsumerRecords<String, GenericRecord> records = consumer.poll(100);
                for (ConsumerRecord<String, GenericRecord> record : records) {
                    System.out.printf("offset = %d, key = %s, value = %s \n", record.offset(), record.key(), record.value());
                }
            }
        } finally {
            consumer.close();
        }

        // execute program
        //env.execute("Flink Streaming Java API Skeleton");
    }
}
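
For completeness, a sketch of what the conversion to a FlinkKafkaConsumer might look like (untested; it assumes the flink-avro and flink-avro-confluent-registry artifacts are on the classpath and that pendingsv is the avro-tools generated class):

// Sketch: ConfluentRegistryAvroDeserializationSchema (added in Flink 1.6.0)
// fetches the writer schema from the schema registry and deserializes
// records into the generated class
DeserializationSchema<pendingsv> schema =
        ConfluentRegistryAvroDeserializationSchema.forSpecific(pendingsv.class, "http://localhost:8081");
FlinkKafkaConsumer010<pendingsv> flinkConsumer = new FlinkKafkaConsumer010<>(topic, schema, props);
env.addSource(flinkConsumer).print();
env.execute("avro consumption via FlinkKafkaConsumer");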