
Our Spring web application uses Spring Batch with Quartz to carry out complex jobs. Most of these jobs run in the scope of a transaction, because if one part of the complex system fails we want any previous database work to be rolled back. We would then investigate the problem, deploy a fix, and restart the servers.

It's becoming an issue because some of these jobs do a HUGE amount of processing and can take a long time to run. As execution time starts to surpass the 1-hour mark, we find ourselves unable to deploy fixes to production for other problems, because we don't want to interrupt a vital job.

I have been reading up on the Reactor implementation as a solution to our problems. We can do a small bit of processing, publish an event, and have other systems do the appropriate action as needed. Sweet!

The only question I have is: what is the best way to handle failure? If I publish an event and a consumer fails to perform some critical action, will it be retried at a later time?

What if an event is published, and before all the appropriate consumers that listen for it can handle it appropriately, the server shuts down for a deployment?

Simon Baslé
IcedDante

2 Answers


I just started using Reactor recently, so I may have some misconceptions about it; however, I'll try to answer you.

Reactor is a library for writing non-blocking code with back-pressure support, which can help you scale your application without consuming a lot of resources.

Reactor's fluent style can easily replace Spring Batch, but Reactor by itself doesn't provide any way to handle transactions, and neither does Spring in a reactive context. With the current JDBC implementation, database access will always be blocking, since there is no driver-level support for non-blocking processing. There are ongoing discussions about how to handle transactions reactively, but as far as I know there's no final decision on the matter.

You can always use transactions, but remember that you won't get non-blocking processing: you need to update/delete/insert/commit on the same thread, or manually propagate the transactional context to a new thread and block the main thread.
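To illustrate the point, here is a minimal plain-JDK sketch of the usual workaround: blocking, thread-bound work (a stand-in for a JDBC transaction — `runTransaction` is a hypothetical placeholder, not a real API) is offloaded wholesale to a dedicated pool, much like Reactor's `Schedulers.boundedElastic()` does. The caller's thread stays free, but the transaction itself is still a blocking call on *some* thread:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingTxOffload {

    // Hypothetical stand-in for a JDBC unit of work: everything between
    // begin and commit must happen on the same thread, blocking it.
    static String runTransaction() {
        // begin(); update(); delete(); insert(); commit();
        return "committed";
    }

    public static void main(String[] args) {
        // Dedicated pool for blocking work, analogous to Reactor's
        // Schedulers.boundedElastic(): the transaction is moved off the
        // caller's thread, but it is not made non-blocking.
        ExecutorService jdbcPool = Executors.newFixedThreadPool(4);
        CompletableFuture<String> result =
                CompletableFuture.supplyAsync(BlockingTxOffload::runTransaction, jdbcPool);
        System.out.println(result.join());
        jdbcPool.shutdown();
    }
}
```

So the "reactive" pipeline gains nothing for the database-bound portion of the job; it only relocates where the blocking happens.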

So I believe Reactor alone won't solve your performance issues, and a different kind of approach may be needed.

My recommendation is:

 - Use parallel processing in Spring Batch
 - Find the optimal chunk size
 - Review your indexes (not just creating them, but also dropping unused ones)
 - Review your queries
 - Avoid unneeded transformations
 - And even more important: profile it! The bottleneck may be somewhere you'd never expect.
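The first two points above can be sketched in plain Java (the chunk size of 100 is an arbitrary assumption to tune, not a recommended value). Each chunk would map to one transaction, and the chunks run on parallel workers — conceptually what a Spring Batch step does when configured with a `TaskExecutor`:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelChunks {
    // Assumed chunk size; in practice this is the number you tune.
    static final int CHUNK_SIZE = 100;

    // Split the item list into fixed-size chunks.
    static List<List<Integer>> chunk(List<Integer> items) {
        int chunks = (items.size() + CHUNK_SIZE - 1) / CHUNK_SIZE;
        return IntStream.range(0, chunks)
                .mapToObj(i -> items.subList(i * CHUNK_SIZE,
                        Math.min((i + 1) * CHUNK_SIZE, items.size())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items =
                IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
        // Each chunk is processed independently and in parallel,
        // like partitioned/multi-threaded step execution in Spring Batch.
        long total = chunk(items).parallelStream()
                .mapToLong(c -> c.stream().mapToLong(Integer::longValue).sum())
                .sum();
        System.out.println(total); // 500500
    }
}
```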

Felipe Rotilho
    To update this answer as the technology has evolved since it was written: Spring now also supports transactions in the reactive paradigm (since Spring 5.2): https://spring.io/blog/2019/05/16/reactive-transactions-with-spring – Jordi Alvarez Apr 22 '22 at 13:35

Spring Batch allows you to break large transactions up into multiple smaller transactions when you use chunk-oriented processing. If a chunk fails, its transaction rolls back, but all previous chunks' transactions will already have committed. By default, when you restart the job, it starts again from where it failed: if it had already processed 99 chunks successfully and the 100th chunk failed, restarting the job resumes at the 100th chunk and continues.
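Here is a toy simulation of that restart behavior in plain Java (the method and class names are illustrative, not Spring Batch APIs): each chunk "commits" independently, the job records how far it got, and a restart resumes from the failed chunk rather than the beginning:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkRestart {

    // Process chunks starting at startChunk; each successful chunk commits.
    // Returns the index of the failed chunk (job stops there), or
    // chunks.size() if everything completed. failAt = -1 means no failure.
    static int processFrom(int startChunk, List<Integer> chunks,
                           int failAt, List<Integer> committed) {
        for (int i = startChunk; i < chunks.size(); i++) {
            if (i == failAt) {
                return i;               // this chunk rolls back; earlier ones stay committed
            }
            committed.add(chunks.get(i)); // chunk i commits its own transaction
        }
        return chunks.size();
    }

    public static void main(String[] args) {
        List<Integer> chunks = new ArrayList<>();
        for (int i = 0; i < 100; i++) chunks.add(i);
        List<Integer> committed = new ArrayList<>();

        // First run: 99 chunks commit, the 100th (index 99) fails.
        int resumeAt = processFrom(0, chunks, 99, committed);
        System.out.println(committed.size() + " chunks committed, resume at " + resumeAt);

        // After deploying a fix, restart from the failed chunk only.
        processFrom(resumeAt, chunks, -1, committed);
        System.out.println(committed.size()); // 100
    }
}
```

In real Spring Batch, the "resume at" bookkeeping lives in the job repository's step execution metadata rather than a return value, but the restart semantics are the same.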

If you have a long-running job and want to deploy a new version, you can stop the job and it will stop after processing the current chunk. You can then restart the job from where it was stopped. It helps to have a GUI to view, launch, stop, and restart your jobs; Spring Batch Admin or Spring Cloud Data Flow provide one out of the box.

httPants