The documentation is not very clear about the role of PlatformTransactionManager in step configuration.

First, stepBuilder.tasklet and stepBuilder.chunk require a PlatformTransactionManager as their second parameter, while the migration guide says: "it is now required to manually configure the transaction manager on any tasklet step definition (...) This is only required for tasklet steps, other step types do not require a transaction manager by design."

Moreover, in the documentation the transactionManager is injected via a method parameter:

/**
 * Note the TransactionManager is typically autowired in and not needed to be explicitly
 * configured
 */

But the transactionManager created by Spring Boot is linked to the DataSource that Spring Boot creates from spring.datasource.url. So with auto-configuration, the following beans work together: dataSource, platformTransactionManager, jobRepository. That makes sense for managing job and step execution metadata.

But unless the readers, writers and tasklets work with this default DataSource used by the JobOperator, the auto-configured transactionManager must not be used for the step configuration. Am I right?

Tasklets or chunk-oriented steps will often need another PlatformTransactionManager:

  • if a step writes data to a specific database, it needs a specific DataSource (not necessarily declared as a bean, otherwise the JobRepository will use it) and a specific PlatformTransactionManager linked to this DataSource
  • if a step writes data to a file or sends messages to a MOM, the ResourcelessTransactionManager is more appropriate. This useful implementation is not mentioned in the documentation.

As far as I understand, the implementation of PlatformTransactionManager for a step depends on where the data is written and has nothing to do with the transactionManager bean used by the JobOperator. Am I right?

Example:

var builder = new StepBuilder("step-1", jobRepository);
PlatformTransactionManager txManager = new ResourcelessTransactionManager();
return builder.<Input, Output> chunk(10, txManager)
    .reader(reader())
    .processor(processor())
    .writer(writer()/*a FlatFileItemWriter*/)
    .build();

or

@SpringBootApplication
@EnableBatchProcessing
public class MyJobConfiguration {

    private DataSource dsForStep1Writer;
    
    public MyJobConfiguration(@Value("${ds.for.step1.writer.url}") String url) {
        this.dsForStep1Writer = new DriverManagerDataSource(url);
    }

    // reader() method, processor() method

    JdbcBatchItemWriter<Output> writer() {
        return new JdbcBatchItemWriterBuilder<Output>()
            .dataSource(this.dsForStep1Writer)
            .sql("...")
            .itemPreparedStatementSetter((item, ps)->{/*code*/})
            .build();
    }

    @Bean
    Step step1(JobRepository jobRepository) {
        var builder = new StepBuilder("step-1", jobRepository);
        var txManager = new JdbcTransactionManager(this.dsForStep1Writer);
        return builder.<Input, Output> chunk(10, txManager)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .build();
    }
    // other methods
}

Is that correct?

Gengis

2 Answers

Role of PlatformTransactionManager with Spring Batch 5

The role of the transaction manager did not change between v4 and v5. I wrote an answer about this a couple of years ago for v4, so I will update it for v5 here:

In Spring Batch, there are two places where a transaction manager is used:

  • In the proxies created around the JobRepository/JobExplorer to create transactional methods when interacting with the job repository/explorer
  • In each step definition to drive the step's transaction

Typically, the same transaction manager is used in both places, but this is not a requirement. It is perfectly fine to use a ResourcelessTransactionManager with the job repository to not store any meta-data and a JpaTransactionManager in the step to persist data in a database.

Now in v5, @EnableBatchProcessing does not register a transaction manager bean in the application context anymore. You either need to manually configure one in the application context, or use the one auto-configured by Spring Boot (if you are a Spring Boot user).

What @EnableBatchProcessing will do though is look for a bean named transactionManager in the application context and set it on the auto-configured JobRepository and JobExplorer beans (this is configurable with the transactionManagerRef attribute). Again, this transaction manager bean could be manually configured or auto-configured by Boot.
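To illustrate the transactionManagerRef attribute mentioned above, here is a minimal sketch of a configuration that points @EnableBatchProcessing at non-default bean names. The bean names batchDataSource and batchTransactionManager and the class name are hypothetical, and the batchDataSource bean is assumed to be defined elsewhere in the application context.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.support.JdbcTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
// Tell Spring Batch which beans to use for the JobRepository/JobExplorer
// instead of the default names "dataSource" and "transactionManager".
@EnableBatchProcessing(
        dataSourceRef = "batchDataSource",
        transactionManagerRef = "batchTransactionManager")
public class BatchInfrastructureConfiguration {

    // Hypothetical bean name; "batchDataSource" is assumed to exist elsewhere.
    @Bean
    public PlatformTransactionManager batchTransactionManager(
            @Qualifier("batchDataSource") DataSource batchDataSource) {
        return new JdbcTransactionManager(batchDataSource);
    }
}
```

With this in place, the job repository transactions go through batchTransactionManager, and steps remain free to use the same bean or a different one.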

Once that is in place, it is up to you to set that transaction manager on your steps or not.
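As a sketch of the case where a step does not reuse the job repository's transaction manager, the configuration below declares two managers and passes the second one to the step. It assumes Spring Batch 5 with default @EnableBatchProcessing bean names, and DataSource and EntityManagerFactory beans defined elsewhere; businessTransactionManager, importStep and the class name are hypothetical.

```java
import jakarta.persistence.EntityManagerFactory;
import javax.sql.DataSource;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.support.JdbcTransactionManager;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
@EnableBatchProcessing // looks up beans named "dataSource" and "transactionManager" by default
public class TwoTransactionManagersConfiguration {

    // Picked up by name for the JobRepository/JobExplorer proxies.
    @Bean
    public PlatformTransactionManager transactionManager(
            @Qualifier("dataSource") DataSource dataSource) {
        return new JdbcTransactionManager(dataSource);
    }

    // Hypothetical second manager, used only to drive the step's transaction.
    @Bean
    public PlatformTransactionManager businessTransactionManager(
            EntityManagerFactory entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory);
    }

    @Bean
    public Step importStep(JobRepository jobRepository,
            @Qualifier("businessTransactionManager") PlatformTransactionManager stepTransactionManager) {
        // Placeholder tasklet; real business logic would go here.
        Tasklet tasklet = (contribution, chunkContext) -> RepeatStatus.FINISHED;
        return new StepBuilder("importStep", jobRepository)
                .tasklet(tasklet, stepTransactionManager)
                .build();
    }
}
```

To reuse the job repository's manager instead, the step method would simply qualify its parameter with transactionManager.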

Mahmoud Ben Hassine

As far as I understand, yes, there can be different transaction managers for the JobOperator (saving the job's state) and for the step itself (what you program).

If you have the problem that data changes made through Spring Data repositories are not saved to the database, then the second transaction manager should be a JpaTransactionManager, as answered in https://stackoverflow.com/a/65517607/7251133.

Florian H.