
Current Implementation

I have a Spring Batch job that writes to a Kafka topic. I read records from the database, transform them, and write them to the Kafka topic.

New Changes to the existing job

I am supposed to write to one more audit topic along with the main topic.

For each record read from the database, I write a message of, say, class Abc type to the main topic, and for the same record I am supposed to write a message of another entity class type to the audit topic.
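
For illustration, here is a simplified sketch of that kind of writer. The class names, topic names, and record types below are placeholders, not my actual code:

import java.util.List;
import org.springframework.batch.item.ItemWriter;
import org.springframework.kafka.core.KafkaTemplate;

// Placeholder types standing in for the real entities described above.
class Abc { }
class AuditEntity { }
class SourceRow { }

public class DualTopicItemWriter implements ItemWriter<SourceRow> {

    private final KafkaTemplate<String, Abc> mainTemplate;          // writes to the main topic
    private final KafkaTemplate<String, AuditEntity> auditTemplate; // writes to the audit topic

    public DualTopicItemWriter(KafkaTemplate<String, Abc> mainTemplate,
                               KafkaTemplate<String, AuditEntity> auditTemplate) {
        this.mainTemplate = mainTemplate;
        this.auditTemplate = auditTemplate;
    }

    @Override
    public void write(List<? extends SourceRow> items) {
        for (SourceRow row : items) {
            mainTemplate.send("main-topic", new Abc());
            // If the job fails here, the main topic already has a message but
            // the audit topic never gets one: nothing rolls back.
            auditTemplate.send("audit-topic", new AuditEntity());
        }
    }
}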

Problem Statement

Currently, I am using a different KafkaTemplate to write to each topic, but the issue is how to handle the case where the job fails after writing to the main topic and never writes to the audit topic. How do I roll back the transaction? (I haven't implemented transactions in the current implementation.)

Do I need to change the whole implementation of my application? Should I write to both topics in a single transaction, or is there a solution that fits my current implementation?

Transaction Manager

@Override
protected JobRepository createJobRepository() throws Exception {
    // Spring Batch meta-data repository backed by the DataSource
    JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
    factory.setDataSource(ds);
    factory.setTransactionManager(transactionManager);
    factory.afterPropertiesSet();
    return factory.getObject();
}

Sonia

2 Answers


Changing the implementation would make your life much easier in the long run. The problem you are describing is known as the Transactional Outbox pattern and has many well-accepted implementations.

The batch job is a good fit for a Kafka connector (Debezium is a more sophisticated and flexible solution). The connector automatically handles the scaling, coordination, offset handling, and concurrency that you would otherwise have to implement yourself with SELECT FOR UPDATE and the like.

My preferred solution is to simplify the problem and divide it in two parts.

Use a connector to write each record to a topic. Then use a Kafka Streams application that applies a stateless single-message transformation (SMT) with exactly-once semantics to produce the transformed message to the audit topic, as sketched below. This way, a message lands in the audit topic only if a message was produced to the original topic, and the transactional complexity is out of the way.
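
A minimal sketch of such a streams application, assuming String serdes; the topic names and the mapping logic are placeholders:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class AuditTransformer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "audit-transformer");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Exactly-once: consuming from the main topic and producing to the
        // audit topic are committed atomically in one Kafka transaction.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("main-topic")
               // stateless single-message transformation to the audit format
               .mapValues(AuditTransformer::toAuditMessage)
               .to("audit-topic");

        new KafkaStreams(builder.build(), props).start();
    }

    private static String toAuditMessage(String mainMessage) {
        return mainMessage; // placeholder mapping
    }
}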

The Kafka connector (Debezium) handles retries, failover, offsets, and so on.

Another, older method is the Transactional Outbox, for which one can use Debezium TX-Outbox.
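
For reference, a minimal sketch of the core of the outbox pattern; the table and column names are made up. The business row and the outbox row are inserted in the same database transaction, and Debezium relays the outbox row to Kafka by tailing that table:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OutboxWriter {

    private final JdbcTemplate jdbcTemplate;

    public OutboxWriter(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Both inserts commit or roll back together, so an outbox row (and hence
    // a Kafka message, once Debezium picks it up) exists if and only if the
    // business change was committed.
    @Transactional
    public void saveWithOutbox(String id, String payload, String auditPayload) {
        jdbcTemplate.update(
                "INSERT INTO business_table (id, payload) VALUES (?, ?)", id, payload);
        jdbcTemplate.update(
                "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)",
                id, "RecordProcessed", auditPayload);
    }
}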

aballaci

To implement this correctly, you need to configure Spring Batch with a JTA transaction manager that coordinates a DataSourceTransactionManager (for Spring Batch's technical meta-data) and a KafkaTransactionManager (for your business data).

What is the best approach to write to two Kafka topics in a single transaction in a Spring Batch job?

If you use something like what was suggested for your previous question here: https://stackoverflow.com/a/65287130/5019386, both writers will be executed in the same transaction, driven by Spring Batch.
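
As a rough sketch of the chained variant discussed in the comments below (not a full JTA setup; ChainedKafkaTransactionManager only orders commits on a best-effort basis rather than providing a true two-phase commit, and it has since been deprecated), the chunk-oriented step can be driven by a transaction manager that spans both resources. Bean and item type names here are assumptions, and the ProducerFactory must have a transactionIdPrefix set for the Kafka side to be transactional:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;
import org.springframework.kafka.transaction.ChainedKafkaTransactionManager;
import org.springframework.kafka.transaction.KafkaTransactionManager;

@Configuration
public class StepConfig {

    // Chains the Kafka transaction with the DataSource transaction that
    // Spring Batch uses for its meta-data.
    @Bean
    public ChainedKafkaTransactionManager<Object, Object> chainedTxManager(
            KafkaTransactionManager<Object, Object> kafkaTxManager,
            DataSourceTransactionManager dataSourceTxManager) {
        return new ChainedKafkaTransactionManager<>(kafkaTxManager, dataSourceTxManager);
    }

    @Bean
    public Step step(StepBuilderFactory steps,
                     ItemReader<String> reader,
                     ItemWriter<String> writer,
                     ChainedKafkaTransactionManager<Object, Object> chainedTxManager) {
        return steps.get("step")
                .<String, String>chunk(100)
                .reader(reader)
                .writer(writer)
                // Each chunk's DB and Kafka work now commit together (best effort).
                .transactionManager(chainedTxManager)
                .build();
    }
}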

Mahmoud Ben Hassine
  • I was able to implement the code from the reference link and I can write to both topics. Just to mention, I am using a different KafkaTemplate for each topic, as the ProducerFactory properties differ in the SSL part; I hope that is not a problem. So, coming back to your suggestion: with my current code, I just need to configure a DataSourceTransactionManager and a KafkaTransactionManager to solve my problem? – Sonia Jan 07 '21 at 13:43
  • I already have a PlatformTransactionManager being set on the JobRepositoryFactoryBean in my class annotated with @Configuration that extends DefaultBatchConfigurer. Meta-data is already being stored in the Spring Batch tables. I don't know how it manages the Kafka writers in a single transaction and whether it will roll back in case the second writer fails; I haven't tested that case yet. Please see my post, edited with a code snippet. – Sonia Jan 08 '21 at 09:39
  • I tried with just the transactionManager I already have, and I see a case where the job writes to the main topic but fails to write to the audit topic. I still can't figure out how to roll back the transaction. – Sonia Jan 20 '21 at 14:46
  • `I tried with just the transactionManager I already have`: You did not specify which transaction manager you already have, but as I mentioned in the answer, you need to use a JTA transaction manager that coordinates the distributed transaction between Kafka and the database. – Mahmoud Ben Hassine Jan 21 '21 at 07:38
  • I have a DataSourceTransactionManager in my job, which comes from auto-configuration. I am trying to create two KafkaTransactionManagers, one per topic, and then chain them into a ChainedKafkaTransactionManager, which can be passed to @Transactional in the writer class where we write to both topics, but I am getting an error about having multiple transaction managers when only one is required. – Sonia Jan 21 '21 at 11:16