I have a job that processes items in chunks of 1000. The items are marshalled into a single JSON payload and posted to a remote service as a batch (all 1000 in one HTTP POST). Sometimes the remote service bogs down and the connection times out. I set up skip for this:

    return steps.get("sendData")
            .<DataRecord, DataRecord> chunk(1000)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .faultTolerant()
            .skipLimit(10)
            .skip(IOException.class)
            .build();

If a chunk fails, Spring Batch retries the chunk, but one item at a time, in order to find out which item caused the failure. In my case no single item caused the failure: the entire chunk succeeds or fails as a chunk and should be retried as a chunk. In fact, dropping to single-item mode makes the remote service very angry, and it refuses to accept the data. We do not control the remote service.

What's my best way out of this? I was trying to see if I could disable single-item retry mode, but I don't even fully understand where this happens. Is there a custom SkipPolicy or something that I can implement? (the methods there didn't look that helpful)

Or is there some way to have the item reader read the 1000 records but pass it to the writer as a List (1000 input items => one output item)?

Ken DeLong
  • What I ended up doing was to turn off all the Spring Batch fault-tolerance, and build a catch-and-retry in my custom ItemWriter. – Ken DeLong May 24 '18 at 21:37
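
For anyone curious what a catch-and-retry writer along those lines might look like, here is a minimal sketch. It is not the code from the job above: `BatchSender`, `RetryingBatchWriter`, the attempt count and the linear backoff are all hypothetical, and the class deliberately does not implement Spring Batch's `ItemWriter` so that it compiles on its own. The idea is only that the whole list is re-sent as a unit on `IOException`.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical abstraction over "marshal to JSON and POST the whole batch".
interface BatchSender<T> {
    void send(List<? extends T> items) throws IOException;
}

// Hypothetical writer that retries the whole batch as a unit.
class RetryingBatchWriter<T> {
    private final BatchSender<T> sender;
    private final int maxAttempts;
    private final long backoffMillis;

    RetryingBatchWriter(BatchSender<T> sender, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.sender = sender;
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
    }

    // In a real job this would be the body of ItemWriter.write(...).
    void write(List<? extends T> items) throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(items); // all items go out in a single call
                return;
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                }
            }
        }
        throw last; // every attempt failed, so let the step fail
    }
}

class RetryDemo {
    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated remote service that times out on the first two calls.
        BatchSender<String> flaky = items -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IOException("simulated timeout");
            }
        };
        new RetryingBatchWriter<>(flaky, 5, 10L).write(Arrays.asList("a", "b", "c"));
        System.out.println("attempts used: " + calls[0]);
    }
}
```

With the step's `faultTolerant()` configuration removed, an exception that escapes `write` simply fails the step, so the remote service never sees single-item requests.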

1 Answer

Let me walk through this in two parts. First I'll explain why it works the way it does, then I'll propose an option for addressing your issue.

Why Retry Is Item By Item

In your configuration, you've specified that the step be fault tolerant. With that, when an exception is thrown in the ItemWriter, we don't know which item caused it, so we don't have a way to skip/retry just that item. That's why, once the skip/retry logic begins, we go item by item.

How To Handle Retry By The Chunk

What this comes down to is that you need to get to a chunk size of 1, where the single item is itself the list of records, in order for this to work. That means that instead of relying on Spring Batch to iterate over the items within a chunk for the ItemProcessor, you'll have to do it yourself. So your ItemReader would return a List<DataRecord> and your ItemProcessor would loop over that list. Your ItemWriter would take a List<List<DataRecord>>. I'd recommend creating a decorator for an ItemWriter that unwraps the outer list before passing it to the main ItemWriter.

This does remove the ability to do true skipping of a single item within that list but it sounds like that's ok for your use case.
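
To make the shape of this concrete, here is a rough sketch of the two decorators. So that it compiles on its own, it uses simplified stand-in interfaces (`SimpleReader`, `SimpleWriter`) rather than Spring Batch's real `ItemReader` and `ItemWriter`; those names, along with `AggregatingReader` and `UnwrappingWriter`, are hypothetical. In a real job you would implement the Spring Batch interfaces and configure the step with a chunk size of 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Simplified stand-ins, declared only so the sketch is self-contained.
// They are NOT the real Spring Batch interfaces.
interface SimpleReader<T> {
    T read() throws Exception; // returns null when the input is exhausted
}

interface SimpleWriter<T> {
    void write(List<? extends T> items) throws Exception;
}

// Hypothetical reader decorator: pulls up to batchSize records from the
// delegate and returns them as ONE item (a List).
class AggregatingReader<T> implements SimpleReader<List<T>> {
    private final SimpleReader<T> delegate;
    private final int batchSize;

    AggregatingReader(SimpleReader<T> delegate, int batchSize) {
        this.delegate = delegate;
        this.batchSize = batchSize;
    }

    @Override
    public List<T> read() throws Exception {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize) {
            T item = delegate.read();
            if (item == null) {
                break; // delegate is exhausted
            }
            batch.add(item);
        }
        return batch.isEmpty() ? null : batch; // null signals end of input
    }
}

// Hypothetical writer decorator: unwraps the outer list and passes each
// inner list to the main writer in a single call.
class UnwrappingWriter<T> implements SimpleWriter<List<T>> {
    private final SimpleWriter<T> delegate;

    UnwrappingWriter(SimpleWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(List<? extends List<T>> batches) throws Exception {
        for (List<T> batch : batches) {
            delegate.write(batch); // e.g. one HTTP POST per inner list
        }
    }
}

class ChunkAsItemDemo {
    public static void main(String[] args) throws Exception {
        Iterator<Integer> source = Arrays.asList(1, 2, 3, 4, 5).iterator();
        SimpleReader<List<Integer>> reader = new AggregatingReader<Integer>(
                () -> source.hasNext() ? source.next() : null, 2);
        SimpleWriter<List<Integer>> writer = new UnwrappingWriter<Integer>(
                batch -> System.out.println("one POST with " + batch));

        // Mimics a step with chunk size 1: each "item" is a whole batch.
        List<Integer> item;
        while ((item = reader.read()) != null) {
            writer.write(Collections.singletonList(item));
        }
    }
}
```

Because the framework now sees each list as a single item, any skip or retry it performs applies to the whole list, never to individual records inside it.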

Michael Minella
  • So you mean the `ItemReader` would return a `List` with only 1 item, and the `ItemProcessor` would batch them up until it got 1000 of them, before emitting a `List>`? – Ken DeLong May 23 '18 at 21:19