
I run Kafka and a ksqlDB server in headless mode. On the ksqlDB server, I have only deployed a couple of queries to experiment with:

CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews-ksql', PARTITIONS=1, REPLICAS=3, value_format='DELIMITED');

CREATE TABLE users_original (registertime BIGINT, gender VARCHAR, regionid VARCHAR, userid VARCHAR) WITH (kafka_topic='users-ksql', PARTITIONS=1, REPLICAS=3, value_format='JSON', key = 'userid');

CREATE STREAM pageviews_enriched AS SELECT users_original.userid AS userid, pageid, regionid, gender FROM pageviews_original LEFT JOIN users_original ON pageviews_original.userid = users_original.userid;

My problem is that the ksqlDB server is now constantly logging this INFO message:
"found no committed offset for partition _confluent-ksql-ksql-01query_CSAS_PAGEVIEWS_ENRICHED_0-Join-repartition-0".

It's spamming the logs with this message about 10 times per second. The corresponding topic is empty.

What does this mean and how can I fix it?

GaryW

1 Answer


The log message is output when a streams thread (a thread that does the stream processing) is assigned a topic partition to process. Before it starts processing, it first checks whether there are any committed offsets, so that it can pick up from where a previous thread finished.

It's normal to see such log lines when creating a stream or table, as no previous thread has processed the partition, so there are no committed offsets.

You may also see such log lines upon restarting your server, or during consumer group rebalancing (more on this below), if no data has been processed through the partition yet.

Where data has previously been processed, you may see similar log lines, but they will include details of the last committed offset.
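
If you want to check for yourself whether any offsets have been committed for the query, you can describe its consumer group with the standard Kafka tooling. A minimal sketch, assuming a broker at localhost:9092; the group name below is taken from the partition name in your log line (typically everything before "-Join-repartition"), and you can confirm it with --list. The tool is kafka-consumer-groups.sh in the Apache Kafka distribution:

kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group _confluent-ksql-ksql-01query_CSAS_PAGEVIEWS_ENRICHED_0

A "-" in the CURRENT-OFFSET column for the repartition partition means nothing has been committed yet, which is exactly the situation the log message describes.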

What is not normal is to be seeing them all the time! This suggests something is wrong.

The most likely cause is consumer group rebalancing.

Consumer groups handle spreading the load across all available stream processing threads, across all clustered ksqlDB servers. When a server is added to or removed from the cluster, the group rebalances to ensure all topic partitions are being processed and work is spread evenly across all instances. There are configurable timeouts used to detect dead consumers.
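
The timeouts in question are the standard consumer settings (session.timeout.ms, heartbeat.interval.ms, max.poll.interval.ms). As a rough sketch only, assuming the usual ksql.streams. pass-through prefix for consumer settings in the server properties file (the values are illustrative, not recommendations; check the docs for your version):

# in the ksqlDB server properties file
ksql.streams.consumer.session.timeout.ms=30000
ksql.streams.consumer.heartbeat.interval.ms=10000
ksql.streams.consumer.max.poll.interval.ms=300000

Raising these can paper over a group that is flapping because processing is slow, but it doesn't address the underlying cause.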

It could be that your consumer groups are unstable, causing constant rebalances and hence these log messages. Even then, I wouldn't expect 10s of log lines per second, unless there are many active queries or a high number of topic partitions.

If consumer group rebalances are going on, you should see them in the logs, though you may need to adjust the logging levels to make them visible.
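
The rebalance messages come from the Kafka consumer's coordinator classes, so one way to surface them (and, if needed, to quiet the offset message itself) is via the log4j configuration on the ksqlDB server. A hedged sketch, assuming the standard Kafka client logger names:

# in the server's log4j.properties
# show group join/rebalance activity
log4j.logger.org.apache.kafka.clients.consumer.internals.AbstractCoordinator=INFO
# optionally silence the 'found no committed offset' INFO line
log4j.logger.org.apache.kafka.clients.consumer.internals.ConsumerCoordinator=WARN

Silencing the logger is only a workaround; if the group really is rebalancing constantly, you still want to find and fix the cause.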

There's plenty of information on the net around causes and fixes for unstable consumer groups.

Andrew Coates
    The message eventually went away after restarting the server. But your answer is the closest thing to understanding what happened. Thank you. – GaryW May 23 '20 at 16:40
  • No problem - glad it helped. – Andrew Coates May 23 '20 at 17:25
  • I'm also having this problem and restarting does not fix it. – Tuna Yagci May 04 '21 at 15:51
  • It sends this log when the consumer sends a fetch and: "if there is no offset associated with a topic-partition under that consumer group the broker does not set an error code (since it is not really an error), but returns empty metadata and sets the offset field to -1." – Tuna Yagci May 06 '21 at 09:40
  • I experienced this when I had two different processes using the same client id. That's a mistake that can easily happen. – earthling42 Jun 18 '21 at 00:04
  • There was a bug in Spring Actuator: https://github.com/spring-cloud/spring-cloud-stream/issues/2223 The problem will probably be resolved in the next release. – wet_waffle Oct 07 '21 at 13:44