I'm using a compacted topic in Kafka, which I load into a HashMap at application startup. I then listen to a normal topic for messages and process them using the HashMap built from the compacted topic.

How can I make sure the compacted topic is fully read and the HashMap fully initialized before I start listening to the other topics? (The same applies to RestControllers.)

JohnD

2 Answers

Implement `SmartLifecycle` and load the map in `start()`. Make sure its phase is earlier than that of any other object that needs the map.
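As a rough illustration of why the phase matters: Spring starts `SmartLifecycle` beans in ascending phase order, so a loader with a lower phase has finished its `start()` before components with a higher phase are started. The sketch below models only that ordering in plain Java. `Startable`, `MapLoader`, `TopicListener`, and `PhaseDemo` are hypothetical stand-ins rather than Spring types, and the phase values are arbitrary; in a real application you would implement `org.springframework.context.SmartLifecycle` instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the getPhase()/start() part of SmartLifecycle.
interface Startable {
    int getPhase();
    void start();
}

// Loads the lookup map; its low phase means it is started first.
class MapLoader implements Startable {
    final Map<String, String> cache = new ConcurrentHashMap<>();

    public int getPhase() { return 0; }

    public void start() {
        // In the real application this would block while reading the compacted topic.
        cache.put("key-1", "value-1");
    }
}

// Depends on the map; its higher phase means it is started after the loader.
class TopicListener implements Startable {
    private final MapLoader loader;
    boolean sawLoadedMap;

    TopicListener(MapLoader loader) { this.loader = loader; }

    public int getPhase() { return 100; }

    public void start() {
        // Records whether the map was already populated when this component started.
        sawLoadedMap = !loader.cache.isEmpty();
    }
}

class PhaseDemo {
    // Starts components in ascending phase order, regardless of registration order.
    static void startAll(List<Startable> components) {
        List<Startable> ordered = new ArrayList<>(components);
        ordered.sort(Comparator.comparingInt(Startable::getPhase));
        ordered.forEach(Startable::start);
    }
}
```

The key point is that the loader's `start()` must block until the map is complete; a `start()` that merely kicks off a background thread would not give the later phases any guarantee.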

Gary Russell
  • Thanks, but in a `KafkaListener`, can I access the partition offset, so that I know whether I have finished initializing the map by reading all partitions up to the last offset? And will the `KafkaListener` only be triggered for that specific bean, even if messages arrive in another listener class? – JohnD May 07 '19 at 13:20
  • You shouldn't use a `KafkaListener` for this; use a raw `Consumer` and `poll()` it until you get no more records. But, yes, you can get the offset etc.; add `@Header` parameters. – Gary Russell May 07 '19 at 13:28
  • Will `KafkaListener` not work in this case, or is it just better to use `Consumer.poll()`? Also, I still need to update the map when updates occur (new records), so the consumer should stay alive. – JohnD May 07 '19 at 13:40
  • The problem is that all `@KafkaListener`s are started in the same phase, and `start()` does not block; it simply starts the consumer. You would also have to listen for an idle event to figure out when you are done. It's simpler to poll the consumer until `poll()` returns no more records (after you seek to beginning, of course). You can always use a `@KafkaListener` as well, to get future updates; just be sure to commit the offset before closing your initial consumer, and put the listener in the same consumer group. – Gary Russell May 07 '19 at 15:31
  • You could set the autoStartup property for all the other listeners to false; and start them manually via the registry, once your map is built. – Gary Russell May 07 '19 at 15:31
  • Thanks for the help. I tried disabling auto-startup on the compacted-topic listeners and starting them manually before the `AbstractMessageListenerContainer` starts the others. But, as you said, I'm having trouble identifying when I am "done". I don't really want to commit an offset, though, since I want to seek from the beginning each time, and I would also need a unique group id per app instance, which I can't have due to restrictions on group usage. – JohnD May 07 '19 at 15:53
  • Also, using `poll()` would require me to duplicate some config. Or can I get an automatically configured Kafka consumer from spring-kafka and call `poll()` on it without any further configuration? – JohnD May 07 '19 at 16:02
  • You can listen for a `containerIdleEvent` (set the `idleEventInterval`) to detect when there are no more records. Committing the offset won't prevent you from seeking to beginning next time, and it will allow the "changes" listener to just get changes. You can simply auto-wire the consumer factory and create a consumer from it. – Gary Russell May 07 '19 at 16:16
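
One way to decide when the initial load is "done", as asked in the comments above, is to compare the consumer's current position with the end offset of each partition. The sketch below shows only that comparison in plain Java. `CatchUpCheck` and its integer partition keys are hypothetical; with a real Kafka consumer the two maps would presumably be filled from `Consumer.endOffsets(...)` (captured once before the loop starts) and `Consumer.position(...)`.

```java
import java.util.Map;

class CatchUpCheck {
    /**
     * Returns true once every partition's position has reached the end offset
     * that was captured before loading started. A partition with no known
     * position yet counts as not caught up.
     */
    static boolean caughtUp(Map<Integer, Long> endOffsets, Map<Integer, Long> positions) {
        for (Map.Entry<Integer, Long> end : endOffsets.entrySet()) {
            long position = positions.getOrDefault(end.getKey(), -1L);
            if (position < end.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

Because the end offsets are captured once up front, records that arrive after startup do not delay the check; they can be picked up later by the regular listener.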

This is an old question, I know, but I wanted to provide a more complete code sample of a solution that I ended up with when I struggled with this very problem myself.

The idea is that, as Gary mentioned in the comments on his own answer, a listener isn't the right tool to use during initialization; that comes afterwards. An alternative to Gary's `SmartLifecycle` idea, however, is `InitializingBean`, which I find less complicated to implement, since it has only one method, `afterPropertiesSet()`:

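import java.time.Duration;
import java.util.List;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.utils.Bytes;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.boot.autoconfigure.kafka.KafkaProperties;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;

@Slf4j
@Configuration
@RequiredArgsConstructor
public class MyCacheInitializer implements InitializingBean {

    private final ApplicationProperties applicationProperties; // A custom ConfigurationProperties-class
    private final KafkaProperties kafkaProperties;
    private final ConsumerFactory<String, Bytes> consumerFactory;
    private final MyKafkaMessageProcessor messageProcessor;

    // Called once during bean initialization, after the dependencies have been injected.
    @Override
    public void afterPropertiesSet() {
        String topicName = applicationProperties.getKafka().getConsumer().get("my-consumer").getTopic();
        Duration pollTimeout = kafkaProperties.getListener().getPollTimeout();

        // try-with-resources closes the temporary consumer once the cache is built.
        try (Consumer<String, Bytes> consumer = consumerFactory.createConsumer()) {
            consumer.subscribe(List.of(topicName));

            log.info("Starting to cache the contents of {}", topicName);

            ConsumerRecords<String, Bytes> records;

            // Keep polling until a poll returns no records, which is treated as the end of the topic.
            // Caveat: a poll can also come back empty before partitions are assigned,
            // so a very short poll timeout may end the loop early.
            do {
                records = consumer.poll(pollTimeout);
                records.forEach(messageProcessor::process);
            } while (!records.isEmpty());
        }

        log.info("Completed caching {}", topicName);
    }
}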

For brevity's sake I'm using Lombok's @Slf4j and @RequiredArgsConstructor annotations, but those can be easily replaced. The ApplicationProperties class is just my way of getting the topic name I'm interested in. It can be replaced with something else, but my implementation uses Lombok's @Data annotation, and looks something like this:

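import java.util.HashMap;
import java.util.Map;

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Data
@Configuration
@ConfigurationProperties(prefix = "app")
public class ApplicationProperties {

    private Kafka kafka = new Kafka();

    @Data
    public static class Kafka {
        // Keyed by a consumer name, e.g. "my-consumer" in the initializer above.
        private Map<String, KafkaConsumer> consumer = new HashMap<>();
    }

    @Data
    public static class KafkaConsumer {
        private String topic;
    }
}
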
Thomas Kåsene