
I have a Kafka Streams topology that uses the Processor API. Inside the processor, there is logic that calls an external API.

In case the API returns a 503, the message will need to be retried.

For now, I am trying to push the failed message to a different Kafka topic, and then use the punctuate method to pull a batch of messages from that failed topic every minute and retry them.
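
Roughly, what I have in mind inside the processor (just a sketch, assuming the processor extends AbstractProcessor; "retry-sink", callExternalApi, and statusCode are placeholder names, and the sink node for the retry topic would be registered with Topology#addSink):

@Override
public void process(String key, Object value) {
    int statusCode = callExternalApi(value); // placeholder for the actual external API call

    if (statusCode == 503) {
        // hand the failed record to a child sink node that writes to the retry topic
        context().forward(key, value, To.child("retry-sink"));
    }
}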

Is there a better way/approach to this problem?

Bala.vrad
  • 1) What happens if, after 1 minute, you still get a 503? Wouldn’t your logic inflate the failed topic in this case? 2) Do you need to retry asynchronously? – Peyman Jun 22 '20 at 23:24
  • @Peyman - The goal is to retry using an exponential backoff algorithm, so for each retry count the destination topic would be different (retry-1, retry-2, retry-3, ...). Also, it will be retried asynchronously, and if the "prevRetryTime" is less than the anticipated "nextRetryTime", it will be placed back in the topic. – Bala.vrad Jun 23 '20 at 02:03

1 Answer


A different yet robust approach would be to use a state store. They are backed by Kafka as compacted changelog topics.

You can store failed messages in the state store, re-process them all from a scheduled punctuator (via schedule), and then delete the ones that were processed successfully.

For example:

import java.time.Duration;

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.KeyValueStore;

public class MyProcessor extends AbstractProcessor<String, Object> {

    private final long schedulerIntervalMs = 60000;
    private final String entityStoreName = "failed-message-store";
    private KeyValueStore<String, Object> entityStore;

    @Override
    public void init(ProcessorContext context) {
        super.init(context);
        this.entityStore = (KeyValueStore<String, Object>) context.getStateStore(entityStoreName);
        // Re-process everything in the store once per interval, on wall-clock time
        context.schedule(Duration.ofMillis(this.schedulerIntervalMs), PunctuationType.WALL_CLOCK_TIME,
                timestamp -> processFailedMessagesStore());
    }

    @Override
    public void process(String key, Object value) {
        boolean apiCallSuccessful = callExternalApi(key, value);

        if (!apiCallSuccessful) {
            // Keep the failed message; the punctuator retries it later
            entityStore.put(key, value);
        }
    }

    private void processFailedMessagesStore() {
        try (KeyValueIterator<String, Object> allItems = entityStore.all()) {
            allItems.forEachRemaining(item -> {
                boolean successfullyProcessed = callExternalApi(item.key, item.value);

                if (successfullyProcessed) {
                    entityStore.delete(item.key);
                }
            });
        }
    }

    private boolean callExternalApi(String key, Object value) {
        // Call the external API here; return false on a 503 so the message is retried
        return true;
    }
}
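
For completeness, one way the store could be wired into the topology (a sketch; the topic name, node names, and serdes are placeholders and have to match your actual key/value types):

StoreBuilder<KeyValueStore<String, String>> storeBuilder =
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("failed-message-store"),
                Serdes.String(),
                Serdes.String());

Topology topology = new Topology();
topology.addSource("source", "input-topic");
topology.addProcessor("retry-processor", MyProcessor::new, "source");
topology.addStateStore(storeBuilder, "retry-processor"); // connect the store to the processor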
Nic Pegg
  • Thank you for the answer. Yes, I started reading about the various state stores available in Kafka Streams (persistent store, global store, key-value store, etc.). My concern was whether the data contained in the state store would get purged if the pod restarts. It doesn't look like that :) – Bala.vrad Jun 24 '20 at 05:29
  • Yes, the data is persisted to disk and to a compacted Kafka topic. If the pod is wiped out and a new one takes its place, the new Kafka Streams app will load all state from the compacted topic onto disk (which may take a while depending on the number of messages). – Nic Pegg Jun 24 '20 at 05:30