I will try to address your points with Spring Batch capabilities:
- Tool is good when schema is known
This is the case with Spring Batch. You will be able to use a StaxEventItemReader, which requires an annotated bean (and therefore a known schema).
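To give an idea of what such a reader does with a known schema, here is a minimal sketch that uses only the JDK's StAX API, not Spring Batch itself: it streams through the XML and builds one object per fragment. The Trade class, its isin and quantity fields, and the element names are assumptions made up for the illustration. With Spring Batch, StaxEventItemReader would locate each fragment by its root element name and delegate the mapping to an unmarshaller.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxFragmentSketch {

    /** Hypothetical bean; in Spring Batch this would be the annotated class. */
    static final class Trade {
        final String isin;
        final int quantity;

        Trade(String isin, int quantity) {
            this.isin = isin;
            this.quantity = quantity;
        }
    }

    /** Streams through the document and emits one Trade per <trade> fragment. */
    static List<Trade> readTrades(String xml) {
        List<Trade> trades = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            try {
                String isin = null;
                int quantity = 0;
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        switch (reader.getLocalName()) {
                            case "isin":
                                isin = reader.getElementText();
                                break;
                            case "quantity":
                                quantity = Integer.parseInt(reader.getElementText().trim());
                                break;
                            default:
                                break; // container elements carry no data of their own
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && "trade".equals(reader.getLocalName())) {
                        // The fragment is complete: emit it and reset the accumulators.
                        trades.add(new Trade(isin, quantity));
                        isin = null;
                        quantity = 0;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
        return trades;
    }

    public static void main(String[] args) {
        String xml = "<trades>"
                + "<trade><isin>XYZ0001</isin><quantity>5</quantity></trade>"
                + "<trade><isin>XYZ0002</isin><quantity>7</quantity></trade>"
                + "</trades>";
        System.out.println(readTrades(xml).size() + " trades read"); // prints "2 trades read"
    }
}
```

Because only one fragment is held in memory at a time, this style of reading stays cheap even for large files.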
- Supports Parallel execution, multiple sessions and error log
Spring Batch supports parallel execution and error logging. I'm not sure what you mean by multiple sessions. The Spring Batch reference documentation has more information about scalability.
- Faster, less memory and less CPU utilization
Spring Batch performance depends a lot on how you use it. Although it may not be the fastest or most efficient tool, it is used in many production environments around the world.
- Supports both inserts and updates
Spring Batch database writers support these operations on common DBMSs (JdbcBatchItemWriter, HibernateItemWriter...).
- Foreign key references for target tables, dropping constraints and add after data load
I think this will need some manual implementation, but I'm not sure, since I haven't had this requirement so far.
- Eliminate duplications
This will be done in your ItemProcessor. Here's an example: "Processing batch of records using Spring Batch before writing to DB".
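As a rough illustration of duplicate elimination in a processor, here is a minimal sketch. To keep it runnable without Spring on the classpath, it declares a small stand-in interface with the same shape as Spring Batch's ItemProcessor; the Customer type and the choice of id as the duplicate key are assumptions. It relies on the Spring Batch convention that returning null from process() filters the item out, so it never reaches the writer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DedupSketch {

    /** Stand-in for org.springframework.batch.item.ItemProcessor (same shape). */
    interface ItemProcessor<I, O> {
        O process(I item) throws Exception;
    }

    /** Hypothetical record type; only the id is used to detect duplicates. */
    static final class Customer {
        final String id;
        final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Drops every item whose id was already seen during this run. */
    static final class DeduplicatingProcessor implements ItemProcessor<Customer, Customer> {
        private final Set<String> seenIds = new HashSet<>();

        @Override
        public Customer process(Customer item) {
            // Set.add returns false when the id is already present.
            // Returning null tells Spring Batch to filter the item out.
            return seenIds.add(item.id) ? item : null;
        }
    }

    /** Mimics the step loop: only non-null results are passed on to the writer. */
    static List<Customer> run(List<Customer> input) {
        DeduplicatingProcessor processor = new DeduplicatingProcessor();
        List<Customer> toWrite = new ArrayList<>();
        for (Customer customer : input) {
            Customer result = processor.process(customer);
            if (result != null) {
                toWrite.add(result);
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        List<Customer> kept = run(List.of(
                new Customer("1", "Ann"),
                new Customer("2", "Bob"),
                new Customer("1", "Ann again")));
        System.out.println(kept.size() + " items kept"); // prints "2 items kept"
    }
}
```

An in-memory set like this only sees the items of the current run in the current JVM. If duplicates can already exist in the target table, or the step is restarted or partitioned, the check would have to go against the database instead.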
- block or batch load support
With Spring Batch you can configure the commit-interval (the number of items handed to your writer and committed in one transaction) as well as the rollback behavior.
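The idea behind the commit interval can be sketched in plain Java: items are read one at a time, buffered, and handed to the writer in blocks of a fixed size. This is only an illustration of the principle, not Spring Batch code; processInChunks and its parameters are hypothetical names, and the transaction handling that Spring Batch wraps around each chunk is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class ChunkSketch {

    /**
     * Reads items one by one and passes them to the writer in chunks of at most
     * commitInterval items. Returns the number of chunks written.
     */
    static <T> int processInChunks(Iterator<T> reader, int commitInterval,
                                   Consumer<List<T>> writer) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be >= 1");
        }
        int chunks = 0;
        List<T> chunk = new ArrayList<>(commitInterval);
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == commitInterval) {
                // In Spring Batch, this is where one write and one commit would happen.
                writer.accept(List.copyOf(chunk));
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            // The last chunk may be smaller than the commit interval.
            writer.accept(List.copyOf(chunk));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<List<Integer>> written = new ArrayList<>();
        int chunks = processInChunks(List.of(1, 2, 3, 4, 5).iterator(), 2, written::add);
        System.out.println(chunks + " chunks: " + written);
        // prints "3 chunks: [[1, 2], [3, 4], [5]]"
    }
}
```

A larger interval means fewer commits and usually better throughput, at the price of more work being rolled back when a chunk fails.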
- headless execution (no-gui for schedule and start)
Spring Batch can be started with a CommandLineJobRunner, or in any other way through a JobLauncher (which then requires some manual implementation).
- Support multiple input formats
Spring Batch can read any kind of flat file (FlatFileItemReader), XML file (StaxEventItemReader), queue (JmsItemReader) or database (JdbcCursorItemReader).
- Support custom data transformation as pluggable components
Data transformation is achieved through an ItemProcessor. There are out-of-the-box implementations, but most often you will have to write your own implementation to apply your custom logic. As for pluggable components, I'm not sure what you mean.
- Transaction control, error handling and logging for future execution
Spring Batch has a whole Retry mechanism and supports Restartability. Both are described in the Spring Batch reference documentation.
- Inspecting the Status of the Jobs, Monitoring
Spring Batch lets you configure where metadata about job status is stored (database, file, RAM...), and you will be able to read that data. There is also a second project, spring-batch-admin, which is a GUI for monitoring and control.
- Integration testing, Sanity testing
Can't answer that.
- Scalable, how to load multiple node in parallel
See point 11 (transaction control, error handling and logging) above. Spring Batch can also be integrated with Spring XD.
- Restart Jobs when they crash, automatic restart after failure
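See point 11 (transaction control, error handling and logging) above, which covers Retry and Restartability.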
- Tracking Status and Statistics during execution
See point 12 (inspecting the status of the jobs, monitoring) above.
- Ability to launch through web or Rest interfaces
Spring Batch can be integrated with Spring Boot to meet these needs.
I hope I answered some of your concerns.