2

I plan to use Amazon MSK and I want to dump consumer logs to S3, but I don't see any option for that. Do I need to write my own consumer, or is there a way to send Amazon MSK consumer output to S3 directly?

Robin Moffatt
Navin Kumar

2 Answers

5

Kafka Connect is generally the best (easiest, most scalable, portable, and resilient) way to move data between Kafka and upstream or downstream systems such as S3. Learn more about Kafka Connect here and in this talk here.

MSK Connect can run Kafka Connect workloads for your MSK on AWS.

Another option you have is to run your own Kafka Connect worker (which connects to MSK) and use the S3 sink connector (tutorial).
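As a rough illustration of the self-managed route, here is a minimal sketch that builds the request used to register an S3 sink connector through the Kafka Connect REST API. It assumes the Confluent S3 sink connector plugin is installed on the worker. The worker URL, connector name, topic, bucket, and region are hypothetical placeholders, and `build_s3_sink_request` is a helper name invented for this example.

```python
import json
import urllib.request


def build_s3_sink_request(connect_url, name, topic, bucket, region):
    """Build (but do not send) a POST /connectors request for an S3 sink."""
    config = {
        # Connector, storage, and format classes from the Confluent S3 sink plugin.
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "tasks.max": "1",
        "topics": topic,
        "s3.bucket.name": bucket,
        "s3.region": region,
        # Number of records written per S3 object; tune for your log volume.
        "flush.size": "1000",
    }
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    req = build_s3_sink_request(
        "http://localhost:8083", "s3-sink", "app-logs", "my-log-bucket", "us-east-1"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually register the connector: urllib.request.urlopen(req)
```

The worker itself still needs `bootstrap.servers` pointing at the MSK brokers and AWS credentials that allow writes to the bucket; those are outside this sketch.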

Robin Moffatt
  • AWS has released MSK Connect recently (managed Apache Kafka Connect), so I've added it to the answer as an option. – ivamax9 Oct 13 '21 at 23:19
3

There is no direct way to do it from MSK. You can use an external consumer, or preferably run Kafka Connect on an EC2 instance in the same VPC as MSK.

Either way, you need to consider high availability and data transfer costs. For HA, run consumers in different AZs. For costs, use an MSK cluster on Apache Kafka 2.4.1 or later, which allows consumers to fetch data from the closest replica.
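To make the closest-replica point concrete, here is a minimal sketch of the settings involved, assuming the rack-aware replica selector that ships with Apache Kafka. The AZ ID value `use1-az1` is a hypothetical placeholder, the helper names are invented for this example, and whether your MSK cluster already exposes each broker's AZ as `broker.rack` is something to verify against the MSK documentation. For Kafka Connect, the consumer setting is passed with the `consumer.` prefix in the worker configuration.

```python
def rack_aware_settings(az_id):
    """Return the broker, plain-consumer, and Connect-worker settings
    that enable fetching from the closest replica."""
    return {
        # Set on the brokers (for MSK, via a custom cluster configuration).
        "broker": {
            "replica.selector.class":
                "org.apache.kafka.common.replica.RackAwareReplicaSelector",
        },
        # Set on a plain consumer running in the given AZ.
        "consumer": {"client.rack": az_id},
        # Set on a Kafka Connect worker; the "consumer." prefix forwards
        # the property to the consumers used by sink connectors.
        "connect_worker": {"consumer.client.rack": az_id},
    }


def to_properties(settings):
    """Render a dict as Java-style .properties lines."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    settings = rack_aware_settings("use1-az1")  # placeholder AZ ID
    for role, props in settings.items():
        print(f"# {role}")
        print(to_properties(props))
```

Running one Connect worker per AZ, each with its own `consumer.client.rack` value, covers both the HA and the cost concern in a distributed Connect cluster.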

herbertgoto
  • What do you mean by using consumers in different AZs? How does it translate into Kafka Connect? – idetyp Aug 02 '21 at 10:19