Streaming application is rolled out in production and right after 10 days observing errors/warnings in the CustomProductionExceptionHandler for expired transactions which belongs to older day window.
FLOW :
INPUT TOPIC --> STREAMING APPLICATION(Produces stats and emits after day window closed) --> OUTPUT TOPIC
Producer continuously trying to publish records to OUTPUT Topic which is already expired with older window and logs an error into CustomProductionExceptionHandler.
I have reduced batch size and kept default but this change is not yet promoted to production.
CustomProductionExceptionHandler Implementation: To Avoid streaming to die due to NeworkException,TimeOutException.
With this implementation producer does not retry and in case of any exceptions it does CONTINUE.. On other side upon returning FAIL.. stream thread dies and does not auto restart..Need suggestions..
public class CustomProductionExceptionHandler implements ProductionExceptionHandler {
@Override
public ProductionExceptionHandlerResponse handle(final ProducerRecord<byte[], byte[]> record,
final Exception exception) {
String recordKey = new String(record.key());
String recordVal = new String(record.value());
String recordTopic = record.topic();
logger.error("Kafka message marked as processed although it failed. Message: [{}:{}], destination topic: [{}]", recordKey,recordVal,recordTopic,exception);
return ProductionExceptionHandlerResponse.CONTINUE;
}
}
Exception:
2019-12-20 16:31:37.576 ERROR com.jpmc.gpg.exception.CustomProductionExceptionHandler.handle(CustomProductionExceptionHandler.java:19) kafka-producer-network-thread | profile-day-summary-generator-291e69b1-5a3d-4d49-8797-252c2ae05607-StreamThread-19-producerid - Kafka message marked as processed although it failed. Message: [{"statistics":{}], destination topic: [OUTPUT-TOPIC]
org.apache.kafka.common.errors.TimeoutException: Expiring * record(s) for TOPIC:1086149 ms has passed since batch creation
Trying to get answer for below questions.
1) Why producer is trying to publish older transactions to OUTPUT Topic for which day window is already closed?
Example - Producer is trying to send 12/09 day window transaction but current opened window is 12/20
2) Streaming threads could have been died without CustomProductionExceptionHandler --> ProductionExceptionHandlerResponse.CONTINUE. Do we have any way that Producer can do retries in case of NetworkException or TimeoutException and then continue instead of stream thread die? Problem of specifying ProductionExceptionHandlerResponse.CONTINUE in the CustomProductionExceptionHandler is - In case of any exception it skips that record publishing to output topic and proceed with next records. No Resiliency.