I want to copy all messages from a topic in a Kafka cluster. I ran Kafka MirrorMaker, but it seems to have copied only about half of the messages from the source cluster (I checked that there is no consumer lag on the source topic). The source cluster has 2 brokers; could that have anything to do with it?
This is the source cluster config:
log.retention.ms=1814400000
transaction.state.log.replication.factor=2
offsets.topic.replication.factor=2
auto.create.topics.enable=true
default.replication.factor=2
min.insync.replicas=1
num.io.threads=8
num.network.threads=5
num.partitions=1
num.replica.fetchers=2
replica.lag.time.max.ms=30000
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
unclean.leader.election.enable=true
zookeeper.session.timeout.ms=18000
The source topic has 4 partitions and is not compacted. The MirrorMaker config is:
- mirrormaker-consumer.properties
bootstrap.servers=broker1:9092,broker2:9092
group.id=picturesGroup3
auto.offset.reset=earliest
- mirrormaker-producer.properties
bootstrap.servers=localhost:9092
max.in.flight.requests.per.connection=1
retries=2000000000
acks=all
max.block.ms=2000000000
Below are the stats from Kafdrop on the source cluster topic:
Partition | First Offset | Last Offset | Size | Leader Node | Replica Nodes | In-sync Replica Nodes | Offline Replica Nodes | Preferred Leader | Under-replicated |
---|---|---|---|---|---|---|---|---|---|
0 | 13659 | 17768 | 4109 | 1 | 1 | 1 | | Yes | No |
1 | 13518 | 17713 | 4195 | 2 | 2 | 2 | | Yes | No |
2 | 13664 | 17913 | 4249 | 1 | 1 | 1 | | Yes | No |
3 | 13911 | 18072 | 4161 | 2 | 2 | 2 | | Yes | No |
And these are the stats for the target topic after the MirrorMaker run:
Partition | First Offset | Last Offset | Size | Leader Node | Replica Nodes | In-sync Replica Nodes | Offline Replica Nodes | Preferred Leader | Under-replicated |
---|---|---|---|---|---|---|---|---|---|
0 | 2132 | 4121 | 1989 | 1 | 1 | 1 | | Yes | No |
1 | 2307 | 4217 | 1910 | 1 | 1 | 1 | | Yes | No |
2 | 2379 | 4294 | 1915 | 1 | 1 | 1 | | Yes | No |
3 | 2218 | 4083 | 1865 | 1 | 1 | 1 | | Yes | No |
As you can see from the Size column, only about half of the source messages are in the target topic. What am I doing wrong?
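
To put a number on "about half", here is a minimal Python sketch that sums the per-partition sizes from the two Kafdrop tables. The offsets are copied from the tables; the names `source`, `target` and `total_size` are only illustrative. It assumes Size is simply last offset minus first offset, which matches the figures shown but only approximates the actual message count.

```python
# (first_offset, last_offset) per partition, copied from the Kafdrop tables.
source = {0: (13659, 17768), 1: (13518, 17713), 2: (13664, 17913), 3: (13911, 18072)}
target = {0: (2132, 4121), 1: (2307, 4217), 2: (2379, 4294), 3: (2218, 4083)}


def total_size(offsets):
    """Sum of (last - first) over all partitions."""
    return sum(last - first for first, last in offsets.values())


source_total = total_size(source)
target_total = total_size(target)
ratio = target_total / source_total

print(f"source={source_total} target={target_total} ratio={ratio:.1%}")
```

By this count the target topic holds 7679 of the 16714 offsets in the source topic, so a bit under half.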