
I have a Spring Batch project that reads a huge zip file containing more than 100,000 XML files.

I am using MultiResourcePartitioner, and I am running into a memory issue: my batch fails with

java.lang.OutOfMemoryError: GC overhead limit exceeded.

It seems as if all the XML files are loaded into memory and never garbage collected after processing.

Is there a more memory-efficient way to do this?

Thanks.

JavaDev
  • What memory settings are you currently using? – Michael Minella Aug 09 '16 at 15:49
  • I am getting this error with these settings: -Xms512m -Xmx1024m. When I set -Xms1024m -Xmx4096m I don't get the error, but the heap is using 2 GB, which seems to be too much for 200,000 XML files of 4 KB each. – JavaDev Aug 09 '16 at 15:54
  • With 200,000 files, do you really need/want one partition per file? You may want to consider writing your own `Partitioner` that groups files together into chunks. – Michael Minella Aug 09 '16 at 18:14
  • I process each file individually: each XML is marshalled, then processed, then written to an XML file. Can you explain further what you mean by a partitioner that groups files into chunks? – JavaDev Aug 10 '16 at 07:37
  • The `MultiResourcePartitioner` creates one partition (and therefore one `ExecutionContext` and one `StepExecution`) per file. With 200,000 files, you may want to group them together so that you have fewer partitions. – Michael Minella Aug 10 '16 at 15:02
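
Following the suggestion in the comments above, a minimal sketch of a custom `Partitioner` that groups resources into fewer partitions instead of creating one per file could look like this. The class name `GroupingResourcePartitioner`, the `groupSize` parameter, and the `fileNames` context key are illustrative assumptions, not part of the original post:

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.core.io.Resource;

public class GroupingResourcePartitioner implements Partitioner {

    private final Resource[] resources;
    private final int groupSize;

    public GroupingResourcePartitioner(Resource[] resources, int groupSize) {
        this.resources = resources;
        this.groupSize = groupSize;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        int partitionNumber = 0;
        // Create one partition per group of files rather than one per file.
        for (int start = 0; start < resources.length; start += groupSize) {
            int end = Math.min(start + groupSize, resources.length);
            // Store only the file names in the ExecutionContext, not the
            // Resource objects themselves, to keep each context small.
            StringBuilder fileNames = new StringBuilder();
            for (int i = start; i < end; i++) {
                if (fileNames.length() > 0) {
                    fileNames.append(',');
                }
                fileNames.append(resources[i].getFilename());
            }
            ExecutionContext context = new ExecutionContext();
            context.putString("fileNames", fileNames.toString());
            partitions.put("partition" + partitionNumber++, context);
        }
        return partitions;
    }
}
```

Each worker step would then need a step-scoped reader that resolves the comma-separated `fileNames` entry from its `ExecutionContext` back into resources, for example via a `MultiResourceItemReader` wrapping the existing StAX reader.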

0 Answers