I am looking for advice on how the following problem is handled in solutions that use Pulsar/Kafka.

The scenario is: A producer is sending messages (in JSON) and the consumer is taking them and inserting data into database tables (specified in the message).

Suddenly the producer changes the structure of the data sent in the messages (let's say because the table structure in the database has a new column).

So for a moment, the queue holds messages with the old data structure while it starts receiving messages with the new one.

My question is how the consumer should handle this scenario. What should it do with the old-structure messages, which are now invalid because they can no longer be inserted into the changed table? Retry and then fail permanently (dead-letter queue?).

Also, do you usually send the metadata along with your messages, or do you handle it in a separate topic or in some other way?

Thanks for any advice

Tiago Alcobia
  • Not seeing how this is unique to Pulsar... Kafka Connect's JDBC sink handles this with message schemas. – OneCricketeer Oct 30 '20 at 23:36
  • Please clarify whether your question is about Pulsar or Kafka, but basically for both you should have a look at Schema Registry and Avro. – fhussonnois Oct 31 '20 at 14:56
  • I tagged this post for Kafka too because, as pointed out, it seems to be applicable there as well, and I'm looking forward to hearing from both communities. Thank you both for the answers. – Tiago Alcobia Oct 31 '20 at 18:17

1 Answer

The issue you describe is primarily about the external system (the database) rather than the broker. Preventing it would require some sort of gate-keeping or pre-validation on the producer side that is aware of how the data will be consumed. Unfortunately, that introduces tight coupling. Without it, you have to write the consumer code explicitly with robust message conversion and exception handling, possibly including a version number or an explicit schema with each message, as the Confluent Schema Registry provides (and maybe the Schema Registry feature of Pulsar as well).
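As a rough illustration of the "version number with each message" idea, here is a minimal, broker-agnostic sketch in plain Python. Everything in it is an assumption made for the example: the `schema_version` and `payload` envelope fields, the `country` column added in version 2, and the `insert_row` callback standing in for the database write. It is not how the Confluent or Pulsar schema registries work internally. The consumer upgrades old messages to the current shape where it knows how, and sets aside anything it cannot interpret for a dead-letter destination instead of retrying forever.

```python
import json

CURRENT_VERSION = 2  # hypothetical: version 2 added a "country" column


class UnsupportedSchema(Exception):
    """Raised when a message cannot be converted to the current schema."""


def upgrade_v1_to_v2(payload):
    # Hypothetical migration: old messages have no "country", so default it.
    upgraded = dict(payload)
    upgraded.setdefault("country", None)
    return upgraded


# Maps a version to the function that upgrades it to the next version.
UPGRADERS = {1: upgrade_v1_to_v2}


def normalize(raw):
    """Parse one raw JSON message and return a payload in the current schema."""
    message = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(message, dict):
        raise UnsupportedSchema("message is not a JSON object")
    version = message.get("schema_version")
    payload = message.get("payload")
    if not isinstance(version, int) or not isinstance(payload, dict):
        raise UnsupportedSchema("missing schema_version or payload")
    # Apply upgrade steps one version at a time until the payload is current.
    while version < CURRENT_VERSION:
        upgrader = UPGRADERS.get(version)
        if upgrader is None:
            raise UnsupportedSchema(f"no upgrade path from version {version}")
        payload = upgrader(payload)
        version += 1
    if version != CURRENT_VERSION:
        raise UnsupportedSchema(f"version {version} is newer than this consumer")
    return payload


def consume(raw_messages, insert_row, dead_letters):
    """Insert convertible messages; collect the rest for a dead-letter topic."""
    for raw in raw_messages:
        try:
            row = normalize(raw)
        except (ValueError, UnsupportedSchema) as exc:
            dead_letters.append((raw, str(exc)))
            continue
        insert_row(row)
```

In a real consumer, `dead_letters.append` would be a publish to a dead-letter topic and `insert_row` the actual database insert. The point is only that old-format messages get converted rather than rejected, and that unconvertible ones fail once and are parked.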

OneCricketeer