2

We have an active-active Kafka cluster setup with topic renaming using MirrorMaker 2.0, as described in https://strimzi.io/blog/2020/03/30/introducing-mirrormaker2/. I believe a topic such as us-email is set up as follows:

dc1

  1. us-email
  2. us-email-dc2 (mirror of dc2)

dc2

  1. us-email
  2. us-email-dc1 (mirror of dc1)

Producers publish to their local DC, and both clusters end up containing the data from both DCs. So far so good.

A consumer app would subscribe to a wildcard topic (us-email-*) to read data from both DCs. If that's the case, do I set up a consumer in each DC to read from its own DC? In that case, each message will be read twice because of mirroring. Or is it recommended to point a single consumer group at a single DC at a time to prevent duplication? If so, how does failover happen if a DC fails?
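One detail worth checking with the wildcard: Kafka's Java consumer takes a regular expression for pattern subscriptions, not a shell-style glob. The sketch below uses Python's re module to illustrate the difference, assuming the whole topic name has to match the pattern. The topic list is hypothetical and follows the naming in the question (as seen from dc1), with an unrelated us-billing topic added for contrast.

```python
import re

# Hypothetical topic names, following the layout in the question (dc1's view).
topics = ["us-email", "us-email-dc2", "us-billing"]

def matching_topics(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the regex matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

# Read as a regex, "us-email-*" means "us-email" plus zero or more hyphens,
# so the mirrored topic is not selected.
print(matching_topics(r"us-email-*", topics))

# A prefix regex selects both the local topic and the mirrored one.
print(matching_topics(r"us-email.*", topics))
```

If the subscription is meant to cover both the local and the mirrored topic, a prefix-style regex such as us-email.* is the safer form under that assumption.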

Madhur Ahuja
  • You can consume from either DC, but you should consume from only one. On failover, you can use the same consumer group to start consuming from the other DC. Before consuming, make sure to map to the correct offset in the other DC using RemoteClusterUtils. – s7vr Aug 06 '20 at 01:06

2 Answers

2

Does consumers in both data centers have to point to single DC

A consumer connects to a single cluster, because it takes only one list of bootstrap servers, so yes.

there is manual failover?

Not clear what you mean by manual.

  1. If the Mirror or destination brokers fail, then consumer stops reading anything
  2. If the source is down, then the mirroring stops, leading back to (1)

consumers in both DC's will get replicated messages as well

Mirroring doesn't guarantee exactly-once delivery, so duplicates are possible.
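Because duplicates can arrive, one common approach is to make the consumer idempotent. The following is a minimal sketch, assuming each message carries a unique ID set by the producer (the question does not say whether one exists). The Deduplicator class and the sample messages are hypothetical, and no Kafka client is involved.

```python
from collections import OrderedDict

class Deduplicator:
    """Remember recently seen message IDs so that repeats can be skipped."""

    def __init__(self, capacity: int = 10_000) -> None:
        self.capacity = capacity
        self._seen = OrderedDict()  # message ID -> None, oldest first

    def first_time(self, message_id: str) -> bool:
        """Return True if message_id has not been seen recently."""
        if message_id in self._seen:
            self._seen.move_to_end(message_id)  # mark as recently seen
            return False
        self._seen[message_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True

dedup = Deduplicator(capacity=3)
# Hypothetical (topic, message_id) pairs; "m2" arrives twice.
incoming = [
    ("us-email", "m1"),
    ("us-email-dc2", "m2"),
    ("us-email-dc2", "m2"),
    ("us-email", "m3"),
]
processed = [mid for _topic, mid in incoming if dedup.first_time(mid)]
print(processed)
```

The bounded cache only catches repeats that arrive close together; a durable store keyed by message ID would be needed to catch duplicates across restarts.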

OneCricketeer
  • "Does consumers in both data centers have to point to single DC" - my question here was that in an active active configuration will consumers point to their respective DC or a single DC? If I point to their respective DC, my consumer application will always receive 2 copies of the same message, because that message is mirrored with other DC and vice versa. Is that expected? – Madhur Ahuja Aug 01 '20 at 06:02
  • They could point at either, but ideally you'd want the one with the lower latency / fewer network hops. It's not possible to read the "source" topic at the destination DC under its original name, because it is renamed when it is mirrored to the destination. – OneCricketeer Aug 02 '20 at 03:39
0

Automatic failover is not possible. Whenever one DC fails, you have to update the consumer manually so that it reads from the other DC. As for consumer offsets, I am not sure whether they are synced so that you can continue where you left off, or whether the consumer is treated as a new consumer group.
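On the offsets question: MirrorMaker 2 emits checkpoint records that pair a consumer group's offset in the source topic with the corresponding offset in the mirrored topic, and the comment under the question points to RemoteClusterUtils for reading them. The following is a toy model of that kind of lookup, not the library's code. The checkpoint values and the translate_offset helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical checkpoints for one group and partition, sorted by source offset:
# (offset in the failed cluster's topic, matching offset in its mirror).
checkpoints = [(0, 0), (100, 40), (200, 95)]

def translate_offset(committed: int, points: list[tuple[int, int]]) -> int:
    """Map a committed source offset to a safe starting offset in the mirror.

    Uses the latest checkpoint at or before the committed offset, so the
    consumer may re-read some records but does not skip any.
    """
    upstream = [up for up, _down in points]
    index = bisect_right(upstream, committed) - 1
    if index < 0:
        return 0  # no usable checkpoint: start from the beginning of the mirror
    return points[index][1]

# The group had committed offset 150 in the cluster that failed.
print(translate_offset(150, checkpoints))
```

Rounding down to an earlier checkpoint trades some re-reading for not losing messages, which fits the at-least-once behaviour of mirroring.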

Suman