I am seeing a very strange issue with one of my Kafka topics. I have a Spring Cloud Stream Kafka Streams app that reads from INPUT_TOPIC (around 20 million records), groups the records by key, aggregates the values of a few fields into a single string, and pushes the result to an output topic. I do this so I can use the entire topic as lookup/join data: AGGREGATED_OUTPUT_TOPIC is meant to be read as a GlobalKTable and joined with another topic on the keys. If I load INPUT_TOPIC without aggregating, records that share a key collapse down to only the latest value. My code is below, followed by a rough sketch of the downstream join I am aiming for.
kStream.groupByKey()
       .windowedBy(TimeWindows.of(Duration.ofMinutes(30)))
       .aggregate(() -> initStr,
               (key, value, agg) ->
                       agg + "::" + value.getSequence() + "|" + value.getComment() + "|" + value.getDtEntered(),
               Materialized.with(Serdes.String(), Serdes.String()))
       .toStream()
       // the windowed key prints as [key@start/end], so strip it back down to the plain key
       .selectKey((k, v) -> k.toString().substring(1).split("@")[0])
       .peek((key, value) -> log.info("key: {}, value: {}", key, value))
       .to("AGGREGATED_OUTPUT_TOPIC");
My config looks like this:
spring:
  cloud:
    stream:
      schemaRegistryClient:
        endpoint: https://kafka.shared.internal.xxxx.com.au:8081
      bindings:
        kstream_input_channel:
          destination: INPUT_TOPIC
      kafka:
        streams:
          binder:
            applicationId: appID
            brokers: b-2.shared.twluaa.c3.kafka.ap-southeast-2.amazonaws.com:9096,b-1.shared.twluaa.c3.kafka.ap-southeast-2.amazonaws.com:9096,b-3.shared.twluaa.c3.kafka.ap-southeast-2.amazonaws.com:9096
            configuration:
              auto.offset.reset: earliest
              schema.registry.url: https://kafka.shared.internal.com.au:8081
              security:
                protocol: SASL_SSL
              sasl:
                mechanism: SCRAM-SHA-512
                jaas:
                  config: org.apache.kafka.common.security.scram.ScramLoginModule required username="clouduser" password="dummy";
              commit.interval.ms: 20000
              state.dir: state-store
              default:
                key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
                value.serde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
          bindings:
            kstream_input_channel:
              consumer:
                keySerde: org.apache.kafka.common.serialization.Serdes$StringSerde
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
                startOffset: earliest
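For completeness, the binding itself looks roughly like this (a sketch of the legacy @EnableBinding/@StreamListener style that matches the kstream_input_channel binding name; InputRecord stands in for my actual Avro class):

import org.apache.kafka.streams.kstream.KStream;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.StreamListener;

public interface KStreamBindings {
    @Input("kstream_input_channel")
    KStream<String, InputRecord> inputStream();
}

@EnableBinding(KStreamBindings.class)
public class AggregationProcessor {

    @StreamListener("kstream_input_channel")
    public void process(KStream<String, InputRecord> kStream) {
        // the groupByKey / windowedBy / aggregate topology shown above goes here
    }
}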
**The aggregating app starts up, consumes, and aggregates fine, but after a while it stops reading and only prints this:
stream-thread [appID-9a5fe835-835e-45c6-9d96-5799d1cb58b7-StreamThread-1] Processed 935164 total records, ran 0 punctuators, and committed 12 total tasks since the last update **
After this, the record count on the output topic gradually drops until it reaches 0 (I watch this in AKHQ by refreshing the topic). So if the app had already grouped and aggregated, say, 5 million records, once this message appears the data starts disappearing from the topic, and unless I stop the app the topic size goes all the way down to 0.
I also tried removing the windowedBy part of the code (see the sketch below), but the behaviour is the same. Could the way the code is written be causing this? For example, could the state store on my dev machine (where the app runs) be getting emptied, and could that in turn empty the topic? I am just guessing here as I am out of ideas...
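The non-windowed variant I tried is roughly this (same aggregation, just without the window, so the key stays a plain String and no selectKey is needed):

kStream.groupByKey()
       .aggregate(() -> initStr,
               (key, value, agg) ->
                       agg + "::" + value.getSequence() + "|" + value.getComment() + "|" + value.getDtEntered(),
               Materialized.with(Serdes.String(), Serdes.String()))
       .toStream()
       .peek((key, value) -> log.info("key: {}, value: {}", key, value))
       .to("AGGREGATED_OUTPUT_TOPIC");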
I have also checked and updated all the topic config, listed below (an AdminClient sketch for double-checking these follows the table):
cleanup.policy delete DEFAULT_CONFIG
compression.type gzip STATIC_BROKER_CONFIG
delete.retention.ms 1 days DEFAULT_CONFIG
file.delete.delay.ms 1 days DYNAMIC_TOPIC_CONFIG
flush.messages 9223372036854776000 DEFAULT_CONFIG
flush.ms 292271023 years 2 weeks DEFAULT_CONFIG
follower.replication.throttled.replicas DEFAULT_CONFIG
index.interval.bytes 4096 DEFAULT_CONFIG
leader.replication.throttled.replicas DEFAULT_CONFIG
max.compaction.lag.ms 292271023 years 2 weeks DEFAULT_CONFIG
max.message.bytes 1048588 DEFAULT_CONFIG
message.downconversion.enable true DEFAULT_CONFIG
message.format.version 2.7-IV2 STATIC_BROKER_CONFIG
message.timestamp.difference.max.ms 292271023 years 2 weeks DEFAULT_CONFIG
message.timestamp.type CreateTime DEFAULT_CONFIG
min.cleanable.dirty.ratio 0.5 DEFAULT_CONFIG
min.compaction.lag.ms 0 seconds DEFAULT_CONFIG
min.insync.replicas 2 STATIC_BROKER_CONFIG
preallocate false DEFAULT_CONFIG
retention.bytes -1 DEFAULT_CONFIG
retention.ms 1 weeks DEFAULT_CONFIG
segment.bytes 1073741824 DEFAULT_CONFIG
segment.index.bytes 10485760 DEFAULT_CONFIG
segment.jitter.ms 0 seconds DEFAULT_CONFIG
segment.ms 1 weeks DEFAULT_CONFIG
unclean.leader.election.enable false
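(The settings above can also be double-checked outside AKHQ with something like the sketch below; the bootstrap server is one of the brokers from the config, and the same security.protocol / sasl.* properties as in the streams app would need to be added.)

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class TopicConfigCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "b-1.shared.twluaa.c3.kafka.ap-southeast-2.amazonaws.com:9096");
        // plus the same security.protocol / sasl.* settings used by the streams app

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic =
                    new ConfigResource(ConfigResource.Type.TOPIC, "AGGREGATED_OUTPUT_TOPIC");
            Config config = admin.describeConfigs(List.of(topic)).all().get().get(topic);
            config.entries().forEach(entry ->
                    System.out.println(entry.name() + " = " + entry.value() + " (" + entry.source() + ")"));
        }
    }
}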
Also, I have no consumer running that reads from the output topic AGGREGATED_OUTPUT_TOPIC.
PS: I had an older question about this but deleted it and am rephrasing it here.