Your concerns are valid: the consumer will also read messages from the active segment, which is never compacted. Log compaction does not guarantee that you have exactly one value for a particular key, but rather at least one.
Here is how Log Compaction is introduced in the documentation:
> Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition.
However, you can try to get compaction to run more frequently so that your active and non-compacted segments stay as small as possible. This comes at a cost, as running the log cleaner takes up broker resources.
There are several topic-level configurations related to log compaction. The most important ones are listed below, and all details can be looked up in the Kafka documentation on topic-level configs (https://kafka.apache.org/documentation/#topicconfigs); a sketch for applying them follows the list:
- delete.retention.ms (how long delete tombstones are retained)
- max.compaction.lag.ms (upper bound on how long a message can remain ineligible for compaction)
- min.cleanable.dirty.ratio (how dirty a log must be before the cleaner considers it)
- min.compaction.lag.ms (minimum time a message stays uncompacted after being written)
- segment.bytes (segment file size; smaller segments roll sooner and become eligible for compaction earlier)
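
If you want to experiment with these settings, here is a minimal sketch using the Java AdminClient. The topic name my-compacted-topic, the broker address, and the concrete values are assumptions and would need tuning for your workload:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class CompactionTuning {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // hypothetical topic name
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-compacted-topic");

            // Small segments roll sooner, and a low dirty ratio plus a tight max lag
            // make the cleaner kick in more aggressively (at the cost of more I/O).
            List<AlterConfigOp> ops = List.of(
                    new AlterConfigOp(new ConfigEntry("segment.bytes", "1048576"), AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("min.cleanable.dirty.ratio", "0.01"), AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("max.compaction.lag.ms", "60000"), AlterConfigOp.OpType.SET));

            admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
        }
    }
}
```

The same changes can be made with the kafka-configs.sh CLI; the AdminClient route is shown here only because it is easy to script.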
However, I am quite convinced that with a log-compacted topic you will not be able to guarantee that your consumer never sees duplicate values for a key.
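
If duplicates matter, the usual pattern is to make the consumer idempotent, for example by materializing only the latest value per key, so that reading an older value for a key is harmless. A minimal sketch, assuming String keys and values and the same hypothetical topic my-compacted-topic:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CompactedTopicReader {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "compacted-reader");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        // Later records for a key simply overwrite earlier ones,
        // so any duplicates the broker delivers collapse here.
        Map<String, String> latestByKey = new HashMap<>();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-compacted-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    latestByKey.put(record.key(), record.value());
                }
                // latestByKey now holds at most one (the newest seen) value per key
            }
        }
    }
}
```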