We set the log retention hours to 1 hour (the previous setting was 72 hours).
Using the Kafka command-line tools below, we set the Kafka retention.ms to 1 hour.
Our aim is to purge the data that is older than 1 hour in the topic topic_test, so we used the following command:
kafka-configs.sh --alter \
--zookeeper localhost:2181 \
--entity-type topics \
--entity-name topic_test \
--add-config retention.ms=3600000
and also
kafka-topics.sh --zookeeper localhost:2181 --alter \
--topic topic_test \
--config retention.ms=3600000
Both commands ran without errors.
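As a sanity check on the value passed to retention.ms, the hours-to-milliseconds arithmetic can be sketched in shell. The helper name hours_to_ms is made up for this illustration and is not part of any Kafka tool; it only confirms that 1 hour is 3600000 ms and shows what the previous 72-hour setting corresponds to.

```shell
#!/bin/sh
# hours_to_ms HOURS: convert whole hours to milliseconds.
# (Hypothetical helper for illustration; not a Kafka command.)
hours_to_ms() {
  # 1 hour = 60 minutes * 60 seconds * 1000 milliseconds
  echo $(( $1 * 60 * 60 * 1000 ))
}

# Value used in the commands above (1 hour).
echo "retention.ms=$(hours_to_ms 1)"

# Value matching the previous 72-hour setting, for comparison.
echo "previous retention in ms: $(hours_to_ms 72)"
```

The first line printed should match the retention.ms value given to --add-config and --config above.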
The problem is that Kafka data older than 1 hour still remains!
In fact, no data was removed from the partitions of the topic topic_test.
We have an HDP Kafka cluster, version 1.0x, with Ambari.
We do not understand why the data in topic_test still remains and has not decreased, even after we ran both commands as described above.
What is wrong with the following Kafka commands?
kafka-configs.sh --alter --zookeeper localhost:2181 --entity-type topics --entity-name topic_test --add-config retention.ms=3600000
kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic_test --config retention.ms=3600000
From the Kafka server.log we can see the following:
[2020-07-28 14:47:27,394] INFO Processing override for entityPath: topics/topic_test with config: Map(retention.bytes -> 2165441552, retention.ms -> 3600000) (kafka.server.DynamicConfigManager)
[2020-07-28 14:47:27,397] WARN retention.ms for topic topic_test is set to 3600000. It is smaller than message.timestamp.difference.max.ms's value 9223372036854775807. This may result in frequent log rolling. (kafka.server.TopicConfigHandler)
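To read the override values out of such a log line, a small shell sketch like the one below may help. It assumes the "Processing override" line format shown above; the helper name extract_override is hypothetical, and the sample line is copied from the excerpt rather than read from a real server.log. It only shows which values the broker logged for the topic and says nothing about whether old segments were actually deleted.

```shell
#!/bin/sh
# Sample line copied from the server.log excerpt above.
log_line='[2020-07-28 14:47:27,394] INFO Processing override for entityPath: topics/topic_test with config: Map(retention.bytes -> 2165441552, retention.ms -> 3600000) (kafka.server.DynamicConfigManager)'

# extract_override KEY: read log lines on stdin and print the numeric value
# logged for KEY in any "Processing override" line.
# (Hypothetical helper for illustration; not part of Kafka.)
extract_override() {
  sed -n "s/.*Processing override.*[(, ]$1 -> \([0-9][0-9]*\).*/\1/p"
}

retention_ms=$(printf '%s\n' "$log_line" | extract_override 'retention.ms')
retention_bytes=$(printf '%s\n' "$log_line" | extract_override 'retention.bytes')

echo "retention.ms logged by the broker: ${retention_ms}"
echo "retention.bytes logged by the broker: ${retention_bytes}"
```

Against a real log, the same function could be fed with the file contents instead of the sample line, for example by piping the output of grep on the topic name into extract_override.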
reference - https://ronnieroller.com/kafka/cheat-sheet