
Is it possible to have a dead letter queue topic on the Kafka source connector side? We have a challenge with the events processed by the IBM MQ source connector: it processes N messages but sends only N-100 to Kafka, where those 100 are poison messages. But from the blog post below by Robin Moffatt, I can see that it is not doable to have a DLQ on the source connector side.

https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues/

The article notes: "Note that there is no dead letter queue for source connectors."

1Q) Has anyone used a dead letter queue with the IBM MQ source connector (documentation below)? https://github.com/ibm-messaging/kafka-connect-mq-source

2Q) Has anyone used a DLQ with any other source connector?

3Q) Why does this limitation exist, i.e. why is there no DLQ on the source connector side?

Thanks.

Dachs84

1 Answer


errors.tolerance is available for source connectors too; refer to the docs.
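As a minimal sketch, these are the error-handling properties that do apply to source connectors (property names are from the Kafka Connect error-handling docs; they only cover failures in the converter/transform stages of the source pipeline):

```properties
# Skip records that fail in the converter/transform stages
# instead of killing the connector task
errors.tolerance=all
# Log details of each failure to the Connect worker log
errors.log.enable=true
# Include the failed record's contents in the log entry
errors.log.include.messages=true
# (errors.deadletterqueue.* settings exist too, but they only take
# effect for sink connectors, per the note quoted in the question)
```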

However, if you compare that to sinks, then no, DLQ options are not available for sources. You would instead need to parse the Connect worker logs for the failed event details, then pipe those to a topic on your own.


Overall, how would a source connector decide which events are bad? A network connection exception means that no messages would be read at all, so there's nothing to produce. If messages fail to serialize to Kafka events, then they also would fail to be produced... Your options are either to fail fast, or to skip and log.

If you just want to send binary data through as-is, then nothing would be "poisonous"; that can be done with the ByteArrayConverter class. That's not really a good use case for Kafka Connect, since it's primarily designed around structured types with parsable schemas, but at least with that option the data gets into Kafka, and you can use Kafka Streams to branch/filter the good messages from the bad ones, as sketched below.
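For instance, here is a minimal Kafka Streams sketch of that branch/filter idea. The topic names, application id, and the isValidUtf8 check are all placeholder assumptions; substitute whatever validity test your real payload format requires (the split() API needs Kafka Streams 2.8+):

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Named;

public class PoisonMessageRouter {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "poison-message-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Raw bytes as landed by the source connector with ByteArrayConverter
        KStream<byte[], byte[]> raw = builder.stream(
                "mq.raw.events",
                Consumed.with(Serdes.ByteArray(), Serdes.ByteArray()));

        // Route each record according to a validity check
        Map<String, KStream<byte[], byte[]>> branches = raw
                .split(Named.as("route-"))
                .branch((key, value) -> isValidUtf8(value), Branched.as("good"))
                .defaultBranch(Branched.as("bad"));

        branches.get("route-good").to("mq.events.clean");
        branches.get("route-bad").to("mq.events.dlq");  // a hand-rolled "DLQ" topic

        new KafkaStreams(builder.build(), props).start();
    }

    // Placeholder check: strict UTF-8 decoding throws on malformed bytes.
    // Replace with whatever parsing your real message format requires.
    private static boolean isValidUtf8(byte[] value) {
        if (value == null) {
            return false;
        }
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(value));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```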

OneCricketeer
  • We are getting data into Kafka from 2 different sources via source connectors. *For example*: assume 2 teams own 2 sources (Source-1 & Source-2) sending data to Kafka. Source-1 sent 100 messages and Source-2 sent 100 messages, so 200 messages should be in the Kafka topic, but only 150 are reaching it. The Source-1 team claims their message formats are unchanged and not poisonous; the Source-2 team claims the same. So on the Kafka source connector side, we need a mechanism to decide whether those missing *50 messages* are really killer messages or not. – Dachs84 May 28 '21 at 14:12
  • To comply with the data format, we are using StringConverter (for both key and value) on the IBM MQ source connector side. https://github.com/ibm-messaging/kafka-connect-mq-sink – Dachs84 May 28 '21 at 14:12
  • If you send the data to one topic per source, it should be really easy to determine which one is the problem. You can also try enabling TRACE logging to get a full view into the Connect API actions – OneCricketeer May 28 '21 at 15:55
  • Actually, we have stopped both sources and tried the same thing (sending data from one source at a time). We are still seeing missing events. Can you please elaborate a bit on the ByteArrayConverter you mentioned in your previous response? – Dachs84 May 28 '21 at 17:44
  • Everything you need to know about the converter is already written in the Readme of that project – OneCricketeer May 29 '21 at 12:25
  • Thank you. Since the ByteArrayConverter class is not recommended for the Connect scenario, how about logging the "bad" messages into a separate topic (as described in pattern #3 in the blog below)? All I need is a mechanism to separate out those killer messages, let the consumer continue, and review the messages later. https://www.confluent.io/blog/error-handling-patterns-in-kafka/ – Dachs84 May 29 '21 at 22:46
  • I'd appreciate comments from anyone who has used the approach of separating badly formatted events into a different topic. – Dachs84 May 30 '21 at 12:03
  • As answered, it's not possible to configure any source connector to do that, and it's still unclear how the Connector is deciding what records are "bad". If they're not read, you should expect to get errors in the Connector logs, but you've not stated what those are – OneCricketeer May 30 '21 at 13:29
  • Thanks. And sorry, I have not worked on Kafka Connect extensively. From that blog I was thinking we could at least route the bad messages to a separate topic. I understand now that there is NO way to do this on the SOURCE side, and that the connector logs are the only means of finding the errors on the SOURCE side. – Dachs84 Jun 01 '21 at 14:02
  • You would still need a source connector to read any data into a topic. If one configuration doesn't work to write data, then I'm not sure I understand how you're planning on writing to any other topic using the connector... Maybe an alternative solution like Apache Camel or Spring Cloud Streams could work – OneCricketeer Jun 01 '21 at 19:58
  • Thank you. We are still struggling to find a way to load the bad messages on the source side into a separate topic. I wish a solution existed like the one on the MongoDB side, where they added dead letter queue support for the source connector: https://github.com/mongodb/mongo-kafka/commit/d68d7da45c98148bed2fef00ef2cf5ba64031d3a. Can the Apache Camel connector help us find the bad records? – Dachs84 Jun 02 '21 at 17:26
  • I haven't used it, but I doubt it. Regardless, even that Mongo connector actually **logs** the exception: https://github.com/mongodb/mongo-kafka/commit/d68d7da45c98148bed2fef00ef2cf5ba64031d3a#diff-cd6672a31761782e6b18f3e207148848f7950593821ef36dcad8d1872d570745R282 So, if the IBM one does not, I suggest you submit a PR to them to at least have it do that, specifically here: https://github.com/ibm-messaging/kafka-connect-mq-source/blob/master/src/main/java/com/ibm/eventstreams/connect/mqsource/JMSReader.java#L289-L296 – OneCricketeer Jun 02 '21 at 22:44
  • That is the only way I can see to reach out to IBM. I appreciate your comments. I hope this thread will be helpful for the folks who are struggling with source-side bad data issues. Thank you. – Dachs84 Jun 03 '21 at 12:43
  • | "how would the source connectors decide what events are bad" This could happen if you implement your own SMT for the source connector, and SMT fails to process the original events, which should ideally sent to a DLQ. – GC001 Feb 25 '22 at 04:46