
We're using MirrorMaker2 to replicate some topics from one Kerberized Kafka cluster to another Kafka cluster (strictly unidirectional). We don't control the source Kafka cluster, and we have been given access only to describe and read the specific topics that are to be consumed.

MirrorMaker2 creates and maintains a topic (mm2-offset-syncs) in the source cluster to encode cluster-to-cluster offset mappings for each topic-partition being replicated. It also creates an AdminClient in the source cluster to handle ACL/config propagation. Because MM2 needs authorization to create and write to this topic in the source cluster, and to perform operations through the AdminClient, I'm trying to understand whether and why we need these mechanisms in our scenario.

My questions are:

  1. In a strictly unidirectional scenario, what is the usefulness of this source-cluster offset-sync topic to MirrorMaker?
  2. If it is indeed superfluous, is it possible to disable it or to operate MM2 without access to create/produce to this topic?
  3. If ACL and Config propagation is disabled, is it safe to assume that the AdminClient is not used for anything else?

In the MirrorMaker code, the offset-sync topic is created by MirrorSourceConnector as soon as it starts and is then maintained by MirrorSourceTask. The same applies to the AdminClient in MirrorSourceConnector.

I have found no way to toggle off these features but honestly I might be missing something in my line of thought.

Alexandre Juma

1 Answer


There is an option, introduced in Kafka 3.0, that makes MM2 create and use the mm2-offset-syncs topic in the target cluster instead of the source cluster.

It comes from KIP-716: https://cwiki.apache.org/confluence/display/KAFKA/KIP-716%3A+Allow+configuring+the+location+of+the+offset-syncs+topic+with+MirrorMaker2

Pull-request:

Tim Berglund mentioned KIP-716 in the Kafka 3.0 release video: https://www.youtube.com/watch?v=7SDwWFYnhGA&t=462s

So, to make MM2 operate on the mm2-offset-syncs topic in the target cluster, you should:

  1. set option src->dst.offset-syncs.topic.location = target
  2. manually create mm2-offset-syncs.dst.internal topic in the target cluster
  3. start MM2

src and dst are example cluster aliases; replace them with your own.
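As a rough illustration of step 1, a minimal mm2.properties fragment might look like the sketch below. The bootstrap addresses and the topic name my-topic are placeholders, not values from the question. The two sync.topic.* lines reflect the scenario described in the question (ACL and config propagation disabled) and are not part of the KIP-716 change itself.

```properties
# Cluster aliases; "src" and "dst" are the example aliases used above.
clusters = src, dst

# Placeholder broker addresses (assumptions, replace with your own).
src.bootstrap.servers = src-broker:9092
dst.bootstrap.servers = dst-broker:9092

# Strictly one-way replication from src to dst.
src->dst.enabled = true
dst->src.enabled = false

# Hypothetical topic name; list the topics you are allowed to read.
src->dst.topics = my-topic

# KIP-716 (Kafka 3.0+): keep the offset-syncs topic in the target cluster.
# Valid values are "source" (the default) and "target".
src->dst.offset-syncs.topic.location = target

# Scenario from the question: do not propagate ACLs or topic configs.
src->dst.sync.topic.acls.enabled = false
src->dst.sync.topic.configs.enabled = false
```

The Kerberos/SASL client settings for the source cluster are left out here; they would go under the src. prefix alongside the bootstrap servers.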

Keep in mind: if the mm2-offset-syncs.dst.internal topic is not created manually in the target cluster, MM2 still tries to create this topic in the source cluster.

In a one-directional replication setup this topic is useless, because it stays empty all the time, but MM2 requires it anyway.

Guram Savinov
  • This KIP seems to address one part of the question. Regarding the need for the AdminClient connection to the source cluster when topic config and ACL propagation is disabled, is there any planned change or workaround to avoid creating the AdminClient? – Alexandre Juma Jan 24 '22 at 15:37
  • @Guram Savinov: Is there any solution for Kafka 2.x versions, or only for 3.x? – Ruchir Vani Jan 27 '22 at 23:53
  • @AlexandreJuma I think AdminClient creation is not a problem: on the source cluster, read ACLs on topics and groups are enough for successful replication – Guram Savinov Feb 01 '22 at 10:55
  • @RuchirVani I didn't find any solution for Kafka 2.x; customizing the source code is the only option that comes to mind – Guram Savinov Feb 01 '22 at 10:59