
I created a custom class based on Apache Flink. Here are the relevant parts of the class definition:

public class StreamData {
    private StreamExecutionEnvironment env;
    private DataStream<byte[]> data;
    private Properties properties;

    public StreamData() {
        env = StreamExecutionEnvironment.getExecutionEnvironment();
    }

    public StreamData(StreamExecutionEnvironment e, DataStream<byte[]> d) {
        env = e;
        data = d;
    }
    public StreamData getDataFromESB(String id, int from) {

        final Pattern TOPIC = Pattern.compile(id);

        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "localhost:9092");
        properties.setProperty("group.id", Long.toString(System.currentTimeMillis()));
        properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        properties.put("metadata.max.age.ms", 30000);
        properties.put("enable.auto.commit", "false");

        if (from == 0)
            properties.setProperty("auto.offset.reset", "earliest");
        else
            properties.setProperty("auto.offset.reset", "latest");


        DataStream<byte[]> stream = env
                .addSource(new FlinkKafkaConsumer011<>(TOPIC, new AbstractDeserializationSchema<byte[]>() {
                    @Override
                    public byte[] deserialize(byte[] bytes) {
                        return bytes;
                    }
                }, properties));
        return new StreamData(env, stream);
    }
    public void print() {
        data.print();
    }

    public void execute() throws Exception {
        env.execute();
    }
}

Using the StreamData class, I try to get some data from Apache Kafka and print it in the main function:

StreamData stream = new StreamData();
stream = stream.getDataFromESB("original_data", 0);
stream.print();
stream.execute();

I got the error:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The implementation of the FlinkKafkaConsumer010 is not serializable. The object probably contains or references non serializable fields.
Caused by: java.io.NotSerializableException: StreamData

As mentioned here, I think it's because some data type used in the getDataFromESB function is not serializable, but I don't know how to solve the problem!

Soheil Pourbafrani

2 Answers


Your AbstractDeserializationSchema is an anonymous inner class, and as a result it contains a reference to the outer StreamData class, which isn't serializable. Either make StreamData implement Serializable, or define your schema as a top-level (or static nested) class.
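A minimal sketch (plain JDK, no Flink; the class names here are made up for illustration) of why the anonymous class drags the non-serializable outer instance along:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CaptureDemo {

    // Stand-in for StreamData: note it does NOT implement Serializable.
    static class Outer {
        Serializable anonymousSchema() {
            // Anonymous inner class: the compiler adds a hidden field
            // referencing `Outer.this`, so serializing this object also
            // tries to serialize the enclosing Outer instance.
            return new Serializable() {};
        }
    }

    // Fix: a top-level or static nested class has no outer reference.
    static class TopLevelSchema implements Serializable {}

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (Exception e) {
            return false; // NotSerializableException lands here
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Outer().anonymousSchema())); // false
        System.out.println(canSerialize(new TopLevelSchema()));          // true
    }
}
```

The same capture happens when Flink's closure cleaner checks the schema passed to `addSource`, which is why the exception names `StreamData` as the non-serializable object.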

Chesnay Schepler
  • I solved that: `public class StreamData implements Serializable { private transient StreamExecutionEnvironment env; private DataStream data ; private Properties properties;` – Soheil Pourbafrani Mar 24 '18 at 17:45
  • In my case I marked my `DeserializationSchema` class as `static`, this way I could keep it as an inner class and the error went away. – Mass Dosage Jul 06 '21 at 11:07

It seems that you are importing FlinkKafkaConsumer010 in your code but using FlinkKafkaConsumer011. Please use the following dependency in your sbt file:

"org.apache.flink" %% "flink-connector-kafka-0.11" % flinkVersion
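
For Maven users, the equivalent dependency should look roughly like this (the `_2.11` Scala-version suffix is an assumption and must match the Scala version of your Flink build):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
```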
ankita.gulati
  • I had added this dependency in Maven. I think it's due to the design of `flink-connector-kafka-0.11` that the error message talks about the `FlinkKafkaConsumer010` class. – Soheil Pourbafrani Mar 16 '18 at 16:56