I have a Spring Cloud Stream application that consumes a Kafka topic and eventually updates an Elasticsearch index. Here is my code:

@Bean
public Consumer<Flux<Message<GraphTextKafkaRecord>>> fetchSeed() {
    return messages -> messages
            .map(message -> {
                var ack = (Acknowledgment) message.getHeaders().get(ACKNOWLEDGEMENT_KEY);
                var graphTextRecord = message.getPayload();
                log.debug("Fetch message from kafka. message: {}", graphTextRecord);
                return Tuples.of(ack, graphTextRecord);
            })
            .filter(tuple -> {
                var result = appRules.test(tuple.getT2());
                if (!result) {
                    tuple.getT1().acknowledge();
                    log.debug("Message: {} has been filtered due to rules defined in application", tuple.getT2());
                }
                return result;
            })
            .map(tuple -> {
                GraphTextKafkaRecord kafkaRecord = tuple.getT2();
                return PageQuality
                        .builder(tuple.getT1(), kafkaRecord.url(), kafkaRecord.pageQuality())
                        .pageQualityAlphaScore(kafkaRecord.pageQualityAlpha())
                        .statusCode(kafkaRecord.statusCode())
                        .build();
            })
            .flatMap(esHandler::insertPageQualityScore)
            .retryWhen(Retry.fixedDelay(5, Duration.ofSeconds(5)))
            .subscribe();
}

and here is the related configuration:

spring:
  cloud:
    stream:
      default-binder: kafka
      kafka:
        binder:
          auto-create-topics: false
          brokers: ${PAGE_QUALITY_PROD_KAFKA_BROKERS:x.x.x.x:9092}
          enable-observation: false
          consumer-properties:
            allow.auto.create.topics: false
        bindings:
          fetchSeed-in-0:
            consumer:
              ack-mode: manual
              enable-dlq: false
              poll-timeout: 21474836470
              start-offset: earliest
      bindings:
        fetchSeed-in-0:
          group: page-quality-group-prod
          destination: ${PAGE_QUALITY_PROD_KAFKA_TOPIC_NAME:graph-text}
          consumer:
            max-attempts: 10
            back-off-initial-interval: 500
            back-off-max-interval: 200
            back-off-multiplier: 2.0
  elasticsearch:
    uris: ${PAGE_QUALITY_PROD_ELASTIC_SEARCH_HOST:http://x.x.x.x:9200}
    username: ${PAGE_QUALITY_PROD_ELASTIC_SEARCH_USERNAME:user}
    password: ${PAGE_QUALITY_PROD_ELASTIC_SEARCH_PASSWORD:pass}
    socket-timeout: 30s
    connection-timeout: 30S

It works well, but after a few days it throws the following exception:

org.springframework.kafka.listener.ListenerExecutionFailedException: Listener failed
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.decorateException(KafkaMessageListenerContainer.java:2944)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeOnMessage(KafkaMessageListenerContainer.java:2891)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeOnMessage(KafkaMessageListenerContainer.java:2857)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.lambda$doInvokeRecordListener$56(KafkaMessageListenerContainer.java:2780)
    at io.micrometer.observation.Observation.observe(Observation.java:559)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeRecordListener(KafkaMessageListenerContainer.java:2778)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeWithRecords(KafkaMessageListenerContainer.java:2630)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:2516)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:2168)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeIfHaveRecords(KafkaMessageListenerContainer.java:1523)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1487)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1362)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.springframework.kafka.KafkaException: Failed to execute runnable
    at org.springframework.integration.kafka.inbound.KafkaInboundEndpoint.doWithRetry(KafkaInboundEndpoint.java:75)
    at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter$IntegrationRecordMessageListener.onMessage(KafkaMessageDrivenChannelAdapter.java:461)
    at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter$IntegrationRecordMessageListener.onMessage(KafkaMessageDrivenChannelAdapter.java:425)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeOnMessage(KafkaMessageListenerContainer.java:2877)
    ... 12 more
Caused by: org.springframework.messaging.MessageDeliveryException: Dispatcher has no subscribers for channel 'application.fetchSeed-in-0'., failedMessage=GenericMessage [payload=byte[88549], headers={kafka_offset=9184809, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@19b7a34e, deliveryAttempt=10, kafka_timestampType=CREATE_TIME, kafka_receivedPartitionId=34, kafka_receivedMessageKey=[B@2d72e8c7, kafka_receivedTopic=graph-text, kafka_receivedTimestamp=1678842937842, kafka_acknowledgment=Acknowledgment for graph-text-34@9184809, contentType=application/json, kafka_groupId=page-quality-group-prod}]
    at org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:76)
    at org.springframework.integration.channel.AbstractMessageChannel.sendInternal(AbstractMessageChannel.java:373)
    at org.springframework.integration.channel.AbstractMessageChannel.sendWithMetrics(AbstractMessageChannel.java:344)
    at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:324)
    at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:297)
    at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:187)
    at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:166)
    at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:47)
    at org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:109)
    at org.springframework.integration.endpoint.MessageProducerSupport.lambda$sendMessage$1(MessageProducerSupport.java:262)
    at io.micrometer.observation.Observation.observe(Observation.java:492)
    at org.springframework.integration.endpoint.MessageProducerSupport.sendMessage(MessageProducerSupport.java:262)
    at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter.sendMessageIfAny(KafkaMessageDrivenChannelAdapter.java:394)
    at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter$IntegrationRecordMessageListener.lambda$onMessage$0(KafkaMessageDrivenChannelAdapter.java:464)
    at org.springframework.integration.kafka.inbound.KafkaInboundEndpoint.lambda$doWithRetry$0(KafkaInboundEndpoint.java:70)
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:329)
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:225)
    at org.springframework.integration.kafka.inbound.KafkaInboundEndpoint.doWithRetry(KafkaInboundEndpoint.java:66)
    ... 15 more
Caused by: org.springframework.integration.MessageDispatchingException: Dispatcher has no subscribers, failedMessage=GenericMessage [payload=byte[88549], headers={kafka_offset=9184809, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@19b7a34e, deliveryAttempt=10, kafka_timestampType=CREATE_TIME, kafka_receivedPartitionId=34, kafka_receivedMessageKey=[B@2d72e8c7, kafka_receivedTopic=graph-text, kafka_receivedTimestamp=1678842937842, kafka_acknowledgment=Acknowledgment for graph-text-34@9184809, contentType=application/json, kafka_groupId=page-quality-group-prod}]

Also, this is the output from the actuator:

{
  "status": "UP",
  "components": {
    "binders": {
      "status": "UP",
      "components": {
        "kafka": {
          "status": "UP",
          "details": {
            "topicsInUse": [
              "graph-text"
            ],
            "listenerContainers": [
              {
                "isPaused": false,
                "listenerId": "KafkaConsumerDestination{consumerDestinationName='graph-text', partitions=0, dlqName='null'}.container",
                "isRunning": true,
                "groupId": "page-quality-group-prod",
                "isStoppedAbnormally": false
              }
            ]
          }
        }
      }
    }
  }
}

As you can see, it shows everything is normal, but the application can't consume any more messages.

I don't know where the problem might be.

If you need additional information, let me know.

badger

1 Answer


When a reactive function fails and you are not handling the error properly inside it, the stream breaks and cannot be recovered by s-c-stream, simply because s-c-stream has no control over it. This is a well-known limitation of using the reactive API. Basically, an imperative function is invoked by the framework every time there is a message, hence we have full control over failures and retries. A reactive function is invoked only once, during startup, to connect the stream the user provides via the function. Once the stream is connected, s-c-stream plays no role in further processing. You can read more about it here
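
For illustration, here is a minimal, hypothetical sketch of what "handling the error properly inside it" could look like (my sketch, not an official pattern; processMessage is a stand-in for the map/filter/insert chain in the question):

import java.util.function.Consumer;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.Message;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@Configuration
class FetchSeedConfig {

    @Bean
    Consumer<Flux<Message<GraphTextKafkaRecord>>> fetchSeed() {
        return messages -> messages
                // Give each message its own inner pipeline: an error terminates
                // only this inner Mono, not the outer Flux that the framework
                // subscribed once at startup.
                .flatMap(message -> processMessage(message)
                        // Absorb the failure here; the outer subscription stays alive.
                        .onErrorResume(ex -> Mono.empty()))
                .subscribe();
    }

    // Hypothetical stand-in for the question's map/filter/insert chain;
    // GraphTextKafkaRecord is the question's own type.
    private Mono<Void> processMessage(Message<GraphTextKafkaRecord> message) {
        return Mono.empty();
    }
}

The placement is the key point: onErrorResume sits on the per-message inner Mono, so a poison message is dealt with locally instead of cancelling the one subscription made at startup. Once that subscription is cancelled, nothing is listening on the binding channel anymore, which is what later surfaces as "Dispatcher has no subscribers".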

Oleg Zhurakousky
  • Thank you. Do you think it would be a good idea to remove the retryWhen and replace it with a doOnError? What are your thoughts? – badger Mar 15 '23 at 08:16
  • Unfortunately I can't call myself a Reactor expert, so consider raising a separate issue. But based on my understanding these are two different operations: one attempts to recover, the other admits and deals with failure. So I think there is a way to actually use both: retry, and if it can't succeed, handle the error (see the sketch after these comments). – Oleg Zhurakousky Mar 15 '23 at 08:50
  • Sorry, I'm asking a lot. Do you know why the actuator shows nothing wrong when this exception occurs? I have included the actuator output above. – badger Mar 15 '23 at 11:13
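
A rough sketch of that "retry, then handle the error" combination (my wiring, unverified: it assumes insertPageQualityScore returns a Mono, and reuses esHandler, PageQuality, and log from the question). The important difference from the question's code is that retryWhen moves onto the per-message insert rather than the outer Flux, and onErrorResume only runs once the retries are exhausted. This fragment would replace the .flatMap(esHandler::insertPageQualityScore) and .retryWhen(...) tail of the question's pipeline:

// Replaces the question's .flatMap(esHandler::insertPageQualityScore)
// and outer .retryWhen(...) steps.
.flatMap(pageQuality -> esHandler.insertPageQualityScore(pageQuality)
        // First try to recover: retry the Elasticsearch insert itself,
        // not the whole Kafka-backed stream.
        .retryWhen(Retry.fixedDelay(5, Duration.ofSeconds(5)))
        // Retries exhausted: admit the failure locally so it never
        // reaches, and cancels, the outer subscription.
        .onErrorResume(ex -> {
            log.error("Insert failed after retries for {}", pageQuality, ex);
            return Mono.empty();
        }))
.subscribe();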