My Versions:
flink.version 1.15.2
scala.binary.version 2.12
java.version 11
My Code:
```java
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Source: commit offsets back to Kafka when a checkpoint completes.
    Properties prop = new Properties();
    prop.put("commit.offsets.on.checkpoint", "true");
    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers(kafkaBroker) // defined elsewhere
            .setTopics("inputTopic")
            .setGroupId("my-group" + System.currentTimeMillis())
            .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .setProperties(prop)
            .build();

    DataStream<String> sourceStream = env.fromSource(
            source, WatermarkStrategy.noWatermarks(), "Kafka Source");

    // Sink: transactional, exactly-once delivery.
    Properties sinkProps = new Properties();
    sinkProps.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, 6000);
    KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers(kafkaBroker)
            .setKafkaProducerConfig(sinkProps)
            .setRecordSerializer(new OffsetSerializer("outputTopic")) // custom schema, sketch below
            .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
            .setTransactionalIdPrefix("trx-" + System.currentTimeMillis())
            .build();

    sourceStream
            .keyBy(new KeySelector<String, String>() {
                @Override
                public String getKey(String value) throws Exception {
                    System.out.println("Key >>" + value);
                    return value;
                }
            })
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) throws Exception {
                    System.out.println("offset >>" + value);
                    if (value.equalsIgnoreCase("4")) {
                        // Deliberate failure to test recovery behavior.
                        System.out.println("Custom unhandled exception");
                        throw new Exception("Custom unhandled exception");
                    }
                    return value;
                }
            })
            .sinkTo(sink);

    env.enableCheckpointing(500);
    env.getCheckpointConfig().setCheckpointTimeout(10000);
    env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
    env.getCheckpointConfig().setMinPauseBetweenCheckpoints(10L);
    env.getCheckpointConfig().setTolerableCheckpointFailureNumber(1);
    env.getCheckpointConfig().setExternalizedCheckpointCleanup(
            ExternalizedCheckpointCleanup.NO_EXTERNALIZED_CHECKPOINTS);
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
            1, org.apache.flink.api.common.time.Time.of(1, TimeUnit.SECONDS)));

    env.execute("tester");
}
```
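`OffsetSerializer` is not shown above; a minimal sketch of what such a `KafkaRecordSerializationSchema` looks like in Flink 1.15, assuming it simply writes each string as the record value of the target topic:

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch of the custom serializer referenced above: writes each String
// element as the value of a record on the configured topic, no key.
public class OffsetSerializer implements KafkaRecordSerializationSchema<String> {
    private final String topic;

    public OffsetSerializer(String topic) {
        this.topic = topic;
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(
            String element, KafkaSinkContext context, Long timestamp) {
        return new ProducerRecord<>(topic, element.getBytes(StandardCharsets.UTF_8));
    }
}
```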
I Tried:
- Checked that transactions are enabled and working on the Kafka cluster
- Changed `TRANSACTION_TIMEOUT_CONFIG` (see the sketch after this list)
- Changed the consumer group id on every run
- Removed the `keyBy` on the DataStream
- Added `env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);` and `env.getCheckpointConfig().enableUnalignedCheckpoints();`
- Used different types of `ExternalizedCheckpointCleanup`
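The transaction timeout change was of roughly this shape; the 10-minute value here is only an illustration, chosen to be well above the checkpoint interval but below the broker's default `transaction.max.timeout.ms` of 15 minutes:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

// Illustrative only: a transaction timeout well above the checkpoint
// interval, but below the broker's transaction.max.timeout.ms default.
Properties sinkProps = new Properties();
sinkProps.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, String.valueOf(10 * 60 * 1000));
```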
My Plea:
I expect this code to produce exactly one message on Kafka per consumed message, but after an exception it reprocesses messages that had already completed the stream successfully in a previous run. Help!
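One detail that may matter when validating the output: with `DeliveryGuarantee.EXACTLY_ONCE`, a consumer reading `outputTopic` with the default `read_uncommitted` isolation level also sees records from transactions that Flink later aborts, which can look like duplicates. A sketch of a verification consumer, assuming a plain Java Kafka client (the broker address and group id are placeholders):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties verifyProps = new Properties();
verifyProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // same broker as kafkaBroker
verifyProps.put(ConsumerConfig.GROUP_ID_CONFIG, "verify-output");           // arbitrary group id
// Only read records from committed transactions; read_uncommitted (the
// default) also returns records from transactions that were aborted.
verifyProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
verifyProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
verifyProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(verifyProps);
consumer.subscribe(Collections.singletonList("outputTopic"));
```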