I want to log a Kafka stream properly and efficiently so that I can investigate the use case presented below.
The problem I am trying to solve is reported as a bug on GitHub (https://github.com/akka/alpakka-kafka/issues/899), but it may be that I am doing something wrong, and I am trying to investigate that.
In short, whenever I consume from a set of topics with multiple consumers belonging to the same consumer group (all consumers in the group subscribe to the same set of topics), only half of the consumers actually perform the consumption; the others simply hang.
The detailed logs are in the ticket.
So I would like to understand why those consumers are not getting any data. What exactly is the problem, and why do they keep logging FETCH_SESSION_ID_NOT_FOUND?
I would like my logging to capture just enough detail to understand that, as opposed to so much detail that it becomes difficult to spot where the problem is coming from.
Weirdly, at this point turning on DEBUG logging for Akka does not make any difference. It is only when I turn it on for the Apache Kafka loggers that I get debug output, but then there is far too much of it.
Here is how my logging is configured.
In my application.conf I have:
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = ${?AKKA_LOG_LEVEL}
  loglevel = "INFO"
  logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
}
In my logback.xml I have:
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <logger name="org.apache" level="${APACHE_LOG_LEVEL:-INFO}"/>
  <logger name="com.elsevier.entellect" level="${APP_LOG_LEVEL:-INFO}"/>
  <logger name="akka" level="${AKKA_LOG_LEVEL:-INFO}"/>

  <root level="${ROOT_LOG_LEVEL:-INFO}">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
I deploy via Kubernetes and inject the environment variables as necessary.
Setting AKKA_LOG_LEVEL to DEBUG makes no difference at all.
However, setting APACHE_LOG_LEVEL to OFF, INFO or DEBUG does make a difference. At INFO I only get basic information about Kafka, as shown in the GitHub post; at DEBUG I get far too much output.
More specifically: which logger, at which level, do I need to enable to at least capture what is happening with the consumers that hang? Are they making requests and getting nothing back? Is there a rebalancing issue?
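For what it's worth, rather than turning on DEBUG for all of org.apache, I could target only the Kafka client classes that seem relevant. Assuming the logger names below are correct (I took them from the kafka-clients class names; FetchSessionHandler appears to be the class that reports FETCH_SESSION_ID_NOT_FOUND), something like this in logback.xml is what I have in mind:

```xml
<!-- Hypothetical narrower configuration: fetch-session and coordinator
     internals at DEBUG, everything else from the Kafka client at INFO.
     Logger names assumed from the kafka-clients class names. -->
<logger name="org.apache.kafka.clients.FetchSessionHandler" level="DEBUG"/>
<logger name="org.apache.kafka.clients.consumer.internals.Fetcher" level="DEBUG"/>
<logger name="org.apache.kafka.clients.consumer.internals.AbstractCoordinator" level="DEBUG"/>
<logger name="org.apache.kafka.clients.consumer.internals.ConsumerCoordinator" level="DEBUG"/>
<logger name="org.apache" level="INFO"/>
```

Is a targeted set of loggers like this the right approach, or am I missing a more appropriate logger?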
Note the configuration of my consumers:
import akka.kafka.ConsumerSettings
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer

val consumerSettings = ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
  .withBootstrapServers(conf.kafkaBroker.bootstrapServers)
  .withGroupId(conf.kafkaConsumer.groupId)
  .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, conf.kafkaConsumer.offsetReset)
  .withProperty(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "1800000")
  .withProperty(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "300000")
  .withProperty(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "60000")
  .withProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1000000")
  .withProperty(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "10000")
These settings are specific to our workload on those consumers: they have to perform very long operations.