
I have a Storm topology that processes messages from Kafka and, depending on the task at hand, makes HTTP calls or saves to Cassandra. I process messages as soon as they arrive. However, a few messages are not processed completely because an external source, such as an HTTP server, does not respond or returns an error. I would like to implement an exponential backoff mechanism so that such messages are retried after some time. I can think of a few ways to achieve this and would like to know which of them is the better solution, and whether there is any other fault-tolerant approach I could use instead. Since this implements an exponential backoff, each message will have a different delay time.

  • Send it to another topic in Kafka which is consumed later. This is my preferred solution. I know we can use Kafka offsets to consume a message at a later stage, but I could not find documentation or sample code for doing this. It would be really helpful if anyone could help me out here.
  • Write the message to Cassandra or Redis, and have a scheduler fetch the messages which are not yet processed but are ready to be consumed and send them to Kafka so that my Storm topology can consume them (the existing solution in another legacy, non-Storm project; see the sketch after this list).
  • Send it to Beanstalkd with a delay (also an existing solution in a legacy, non-Storm project; however, I would like to avoid this and use it only if I run out of options).
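
For concreteness, here is a minimal sketch of the scheduler in the second option, assuming Redis (via Jedis) as the delay store and the standard Kafka producer API; the sorted-set key `delayed:http` and the topic `retry-topic` are illustrative placeholders, not names from my setup:

```java
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import redis.clients.jedis.Jedis;

public class RetryScheduler {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             Jedis jedis = new Jedis("localhost")) {
            while (true) {
                long now = System.currentTimeMillis();
                // Members are message payloads; the score is the retry timestamp,
                // so everything scored <= now is due for a retry.
                Set<String> due = jedis.zrangeByScore("delayed:http", 0, now);
                for (String payload : due) {
                    producer.send(new ProducerRecord<>("retry-topic", payload));
                    // At-least-once: a crash between send() and zrem() means a
                    // duplicate delivery, so processing should be idempotent.
                    jedis.zrem("delayed:http", payload);
                }
                Thread.sleep(1000); // poll interval
            }
        }
    }
}
```

Because the delay is just the sorted-set score, each message can carry its own backoff, which matches the per-message delay requirement above.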

This is pretty much what I would like to do, but I am not able to find documentation for implementing delayProcessingUntil as mentioned in Kafka - Delayed Queue implementation using high level consumer.
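
As a starting point, here is a minimal sketch of that delayed-queue pattern, assuming one topic per fixed delay (so messages sit in scheduled order) and using the newer KafkaConsumer API rather than the old high-level consumer; the topic name `retry-5s` and the `parseNotBefore`/`process` helpers are hypothetical placeholders:

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DelayedConsumer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "delayed-retry");
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("retry-5s"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    long notBefore = parseNotBefore(record.value()); // hypothetical helper
                    long wait = notBefore - System.currentTimeMillis();
                    if (wait > 0) {
                        // Safe only because every message in this topic shares the
                        // same delay and is therefore already in scheduled order.
                        Thread.sleep(wait);
                    }
                    process(record.value());
                }
                consumer.commitSync();
            }
        }
    }

    private static long parseNotBefore(String payload) { /* extract timestamp */ return 0L; }

    private static void process(String payload) { /* replay the HTTP call */ }
}
```

The sleep works only because every message in the topic shares the same delay; with per-message delays a single slow message would block everything behind it, which is why the datastore-backed scheduler handles the variable-delay case better.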

I have done scheduled jobs from a datastore and delays using Beanstalkd in the past, but I would prefer to use Kafka.

Ankit Gupta

3 Answers


I think your use case describes the need for a database rather than a queue. You want to temporarily store records until their time comes and then remove them so they don't show up in future searches. Trying to do that in a queue would be awkward at best, as your analysis shows.

I suggest you create another column family in Cassandra to hold these delayed requests. You'd store the request itself along with a time to retry. Whether you'd want to also have a time series of failed HTTP attempts and related data is up to you. As a delayed request is finally fulfilled, you'd delete the corresponding row from the CF. The search for delayed requests is straightforward, too.
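
A minimal sketch of that column family, assuming the DataStax Java driver (3.x); the keyspace, table, and column names, as well as the hourly bucket scheme, are illustrative choices, not a prescribed layout:

```java
import java.util.Date;
import java.util.UUID;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class DelayedRequestStore {

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");

        // A coarse time bucket as the partition key keeps the "what's due"
        // query a single-partition range scan over the retry_at clustering column.
        session.execute("CREATE TABLE IF NOT EXISTS delayed_requests ("
                + " bucket text,"
                + " retry_at timestamp,"
                + " request_id uuid,"
                + " payload text,"
                + " PRIMARY KEY (bucket, retry_at, request_id))");

        String bucket = "2016-04-07-15"; // e.g. one bucket per hour

        // Store a failed request together with its next retry time.
        session.execute("INSERT INTO delayed_requests (bucket, retry_at, request_id, payload)"
                        + " VALUES (?, ?, ?, ?)",
                bucket, new Date(System.currentTimeMillis() + 30_000),
                UUID.randomUUID(), "{...original request...}");

        // Fetch everything in the current bucket that is now due.
        ResultSet due = session.execute(
                "SELECT * FROM delayed_requests WHERE bucket = ? AND retry_at <= ?",
                bucket, new Date());
        for (Row row : due) {
            // Replay the request here, then delete the row so it does not
            // show up in future scans.
            session.execute(
                    "DELETE FROM delayed_requests WHERE bucket = ? AND retry_at = ? AND request_id = ?",
                    bucket, row.getTimestamp("retry_at"), row.getUUID("request_id"));
        }
        cluster.close();
    }
}
```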

Of course, any database, or even a file on the local drive or in HDFS, could work, too.

Chris Gerken

The Kafka spout has exponential backoff message retry built in. You can configure the initial delay, the delay multiplier and the maximum delay through the spout configuration (a sketch follows the link below). If there is an error in the bolt, you can call collector.fail(input); after that you just leave it to the spout to do the retry.

https://github.com/apache/storm/blob/v0.10.0/external/storm-kafka/src/jvm/storm/kafka/ExponentialBackoffMsgRetryManager.java
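
A minimal sketch of that configuration, based on the public fields of the storm-kafka SpoutConfig used by the retry manager linked above; the ZooKeeper address, topic, ZK root, consumer id and delay values are placeholders:

```java
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

public class RetrySpoutSetup {

    public static KafkaSpout buildSpout() {
        SpoutConfig spoutConfig = new SpoutConfig(
                new ZkHosts("zkhost:2181"), // ZooKeeper used by Kafka
                "my-topic",                 // topic to consume
                "/kafka-spout",             // ZK root for offset storage
                "my-spout-id");             // consumer id

        // ExponentialBackoffMsgRetryManager settings: the delay for attempt n is
        // roughly initialDelay * multiplier^n, capped at the maximum delay.
        spoutConfig.retryInitialDelayMs = 1000;       // first retry after 1s
        spoutConfig.retryDelayMultiplier = 2.0;       // double the delay each attempt
        spoutConfig.retryDelayMaxMs = 10 * 60 * 1000; // never wait more than 10 min

        return new KafkaSpout(spoutConfig);
    }
}
```

With this in place, a collector.fail(input) anywhere in the topology (provided the tuple is anchored; see the comments below) hands the offset back to the retry manager, which replays it after the computed delay.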

hobgoblin
  • Apologies for the late reply. As suggested, I configured the initial delay, delay multiplier and maximum delay through the spout configuration and called collector.fail(input). My setup is Spout -> Bolt 1 -> Bolt 2 -> Bolt 3, and I want to restart processing if execution fails in any of the bolts. The suggested solution works fine when collector.fail(input) is called from Bolt 1: the back-off works perfectly. However, when I call collector.fail(input) from Bolt 2 or Bolt 3, the retry does not happen. Also, if I don't call collector.ack(input) from Bolt 1, it does a retry after some time. – Ankit Gupta Apr 07 '16 at 15:06
  • I am sending data from one bolt to another using the following code: `collector.emit(streamId, new Values(data));` `declarer.declareStream(streamId, new Fields(field_name));` – Ankit Gupta Apr 08 '16 at 07:01
  • It would be really helpful if you could help me out here @abhishek-agarwal – Ankit Gupta Apr 11 '16 at 10:14
  • @AnkitGupta - You have to anchor the tuples you are sending from one bolt to the other. Change your code in the bolt to `collector.emit(streamId, input, new Values(data))` (see the sketch after these comments). – hobgoblin Apr 11 '16 at 13:24
  • Is there a way I can modify the tuple and change the value of the attempt number, or restrict the number of retries? – Ankit Gupta Apr 18 '16 at 06:38
  • I want to call `collector.ack(input)` on the 3rd attempt even if it's a failure. – Ankit Gupta Apr 18 '16 at 06:39
  • No, that is not possible right now, but it will be available soon: https://github.com/apache/storm/pull/1331 – hobgoblin May 04 '16 at 09:55
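
A minimal sketch of the anchoring fix from the comments above: each bolt passes the input tuple as the anchor when emitting, so a fail() in a downstream bolt propagates back to the spout and triggers the retry. The stream and field names are placeholders, and the example targets the backtype.storm packages of Storm 0.x:

```java
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class AnchoringBolt extends BaseRichBolt {

    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            String data = transform(input.getStringByField("field_name"));
            collector.emit("streamId", input, new Values(data)); // anchored emit
            collector.ack(input);
        } catch (Exception e) {
            collector.fail(input); // the spout's retry manager schedules the replay
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declareStream("streamId", new Fields("field_name"));
    }

    private String transform(String in) { return in; } // stand-in for real work
}
```

Anchoring has to happen at every hop: if any bolt emits unanchored, the tuple tree is broken at that point and a downstream fail() can no longer reach the spout.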

You might be interested in the Kafka Retry project https://github.com/IBM/kafka-retry. It provides a delayed retry queue using a single retry topic.

Boon