
Why is there a checkpointInfo() method in the JSR 352 ItemWriter interface? How do the reader and writer communicate on what is being committed?

  • First, there's no need for the writer to make use of **checkpointInfo()**. You can simply choose to ignore the checkpoint value in **open()** and return `null` (as the implementation in **AbstractItemWriter** does). This is common. You might use a checkpoint when writing to a flat file, say, to keep track of a character position. Since file writes aren't typically transactional, you need to account for this somehow. I believe Spring Batch has some samples; maybe someone will point to them. Besides the lack of a sample, does that answer your question? – Scott Kurz Aug 05 '17 at 01:29
  • @ScottKurz Thanks, Scott! Yes, in my case I do have a flat file as output. As a follow-up question: if the chunk size is 100 and the process fails at #550, then on restart my reader will have the checkpoint at 500. But in my writer I need to make sure I have it at 550 so that I don't re-write the previously processed records 500-550. This would be a scenario where my writer checkpoint can be used. Would you agree? – user8341239 Aug 05 '17 at 01:48
  • @ScottKurz Thanks !! – user8341239 Aug 08 '17 at 12:47
  • My last comment was confusing, since the writeItems() only happens at the very end of the chunk. So I deleted it and wrote up my whole response as an answer below. – Scott Kurz Aug 08 '17 at 14:26

1 Answer


There's no need for the writer to make use of the checkpointInfo() method. You can simply choose to ignore the checkpoint value in open() and return null (as AbstractItemWriter does). This is common. Typically you don't need any kind of "cursor" or index to write to a database, since you just insert/update whatever the reader/processor hand you (driven by the reader checkpoint, etc.). A minimal sketch of that case follows.
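Here is a minimal sketch of such a writer, assuming a CDI-managed batch artifact; the class name `OrderDbWriter` and the persistence details are illustrative, not from the original answer. Only writeItems() is overridden, since AbstractItemWriter already ignores the checkpoint in open() and returns null from checkpointInfo().

```java
import java.util.List;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.inject.Named;

@Named
public class OrderDbWriter extends AbstractItemWriter {

    @Override
    public void writeItems(List<Object> items) throws Exception {
        for (Object item : items) {
            // Insert/update each item (e.g. via JDBC or JPA).
            // The chunk transaction covers these writes, so the writer
            // doesn't need to track any position of its own.
        }
    }
    // open(), close(), and checkpointInfo() are inherited from
    // AbstractItemWriter: open() ignores the checkpoint and
    // checkpointInfo() returns null.
}
```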

You might use a checkpoint when writing to a flat file. Since file writes aren't typically transactional, you need to account for this somehow.

One simple approach would be to checkpoint the byte offset into the file at the end of the most recent chunk. So if the chunk transaction rolls back after you've written records 501-600 into the file, say, then on restart you'll re-read and re-process records 501-600. Even though records 501-600 are already in the file, you'll now overwrite them, since you're (re)starting at the byte position just after record 500.
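A rough sketch of that offset-based approach, assuming the output path, class name, and line-oriented formatting are placeholders: checkpointInfo() records the file position at chunk commit, and open() seeks back to the last committed position on restart so a partially written chunk is simply overwritten.

```java
import java.io.RandomAccessFile;
import java.io.Serializable;
import java.util.List;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.inject.Named;

@Named
public class OffsetCheckpointFileWriter extends AbstractItemWriter {

    private RandomAccessFile file;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        file = new RandomAccessFile("/tmp/output.dat", "rw"); // path is illustrative
        // On restart, seek back to the byte offset committed with the last
        // successful chunk, so the partially written chunk is overwritten.
        long committedOffset = (checkpoint != null) ? (Long) checkpoint : 0L;
        file.seek(committedOffset);
    }

    @Override
    public void writeItems(List<Object> items) throws Exception {
        for (Object item : items) {
            file.write((item.toString() + System.lineSeparator()).getBytes());
        }
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        // Called at chunk commit time: persist how far we have safely written.
        return Long.valueOf(file.getFilePointer());
    }

    @Override
    public void close() throws Exception {
        if (file != null) {
            file.close();
        }
    }
}
```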

Since a restart shouldn't be needed too often, and you only have one chunk's worth of data to re-process, this can be an easy, acceptable way to work around the lack of a transactional resource.

Scott Kurz