
I'm writing a file polling implementation and am trying to determine whether I need to use an AcceptOnceFileListFilter. The first step the FileProcessor performs is to move the file to another directory.

Does the poller "batchFilePoller" use multiple threads when polling? Can a race condition occur where a file will be read by multiple threads? In this case I assume I need to use the AcceptOnceFileListFilter.

However, if the poller uses only one thread from the pool, and the file is moved successfully before the next poll, I assume there is no possibility of the file being processed twice?

<int-file:inbound-channel-adapter id="batchFileInAdapter" directory="/somefolder" auto-create-directory="true" auto-startup="false" channel="batchFileInChannel" >
    <int:poller id="batchFilePoller" fixed-rate="6000" task-executor="batchTaskExecutor" max-messages-per-poll="1" error-channel="batchPollingErrorChannel" />
</int-file:inbound-channel-adapter>

<int:channel id="batchFileInChannel"/>

<int:service-activator input-channel="batchFileInChannel" >
    <bean class="com.foo.FileProcessor" />
</int:service-activator>

<task:executor id="batchTaskExecutor" pool-size="5" queue-capacity="20"/>
Michael Freeman

1 Answer


The <int-file:inbound-channel-adapter> has a prevent-duplicates option that is true by default. That default applies in your case, because you don't set any other option that would switch it off.

And yes: any polling adapter is multi-threaded if you use fixed-rate. In that case a new polling task can start before the previous one has finished.

Even if polling is single-threaded (using fixed-delay), the AcceptOnceFileListFilter must be there, because a new polling task doesn't know whether a file has already been processed, so it reads the same file again.
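
As a rough sketch of the two points above, the adapter from the question could be written with the default made explicit and with fixed-delay instead of fixed-rate. The ids and directory are copied from the question; stating prevent-duplicates explicitly is only for illustration, since it is already the default here, and the task-executor is left out to keep the fragment short.

```xml
<!-- Sketch only: assumes the int and int-file namespaces are declared as in the question. -->
<int-file:inbound-channel-adapter id="batchFileInAdapter"
        directory="/somefolder"
        auto-create-directory="true"
        auto-startup="false"
        prevent-duplicates="true"
        channel="batchFileInChannel">
    <!-- fixed-delay: the next poll is scheduled only after the previous one has finished -->
    <int:poller id="batchFilePoller"
            fixed-delay="6000"
            max-messages-per-poll="1"
            error-channel="batchPollingErrorChannel"/>
</int-file:inbound-channel-adapter>
```

With prevent-duplicates="true" the adapter applies an AcceptOnceFileListFilter, so a file that is still sitting in the directory on the next poll is not emitted a second time.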

AcceptOnceFileListFilter is meant for exactly those cases where you don't want to read the same file a second time. Alternatively, you can avoid re-reading by removing the file once it has been processed, using <int:transactional synchronization-factory=""/> on the <poller> of the <int-file:inbound-channel-adapter>:

<int:transaction-synchronization-factory id="txSyncFactory">
    <int:after-commit expression="payload.delete()"/>
</int:transaction-synchronization-factory>

together with a PseudoTransactionManager as the transaction manager.
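
Putting those pieces together might look roughly like the following. This is a sketch under assumptions, not a tested configuration: the bean id transactionManager is a name chosen for illustration, the adapter attributes are copied from the question, and the task-executor is omitted so that the poll and the downstream flow run on the same thread.

```xml
<!-- Sketch only: assumes the int and int-file namespaces are declared as in the question. -->

<!-- No real transaction; it only provides the boundaries that trigger the synchronization callbacks. -->
<!-- The id "transactionManager" is an illustrative choice. -->
<bean id="transactionManager"
      class="org.springframework.integration.transaction.PseudoTransactionManager"/>

<int:transaction-synchronization-factory id="txSyncFactory">
    <!-- Runs after the downstream flow completes without an exception -->
    <int:after-commit expression="payload.delete()"/>
</int:transaction-synchronization-factory>

<int-file:inbound-channel-adapter id="batchFileInAdapter"
        directory="/somefolder"
        auto-create-directory="true"
        auto-startup="false"
        channel="batchFileInChannel">
    <int:poller id="batchFilePoller" fixed-delay="6000" max-messages-per-poll="1">
        <int:transactional transaction-manager="transactionManager"
                synchronization-factory="txSyncFactory"/>
    </int:poller>
</int-file:inbound-channel-adapter>
```

Here payload is the java.io.File produced by the adapter, so the after-commit expression deletes the file once processing has succeeded.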

You can find more information in the Spring Integration Reference Manual.

Artem Bilan
  • OK, so if the first thread has not completed its work in 6 seconds, another thread will be grabbed from the pool? I guess as long as the file is moved within those six seconds it won't be read again. It's a risk, but in this scenario prevent-duplicates="false" can be set. This is, of course, only if processing duplicates is OK. – Michael Freeman Feb 04 '15 at 12:02
  • True. But note that you use `task-executor`, so an additional thread won't be grabbed from the `TaskScheduler` for the polling task. That's because the first `poll` hands the message off to that `task-executor` and finishes its work, returning its thread to the `TaskScheduler` pool. Again: that is described in the Reference Manual. – Artem Bilan Feb 04 '15 at 12:07
  • Thanks Artem, I have a better understanding now. I'll use a fixed-delay and allow duplicates. The file will be moved as part of the polling before handing off to another task-executor for processing. – Michael Freeman Feb 04 '15 at 13:55
  • Artem, how can you configure it to grab more threads from the `TaskScheduler` before the task-executor that it shifted work to has finished? – Beamie Jan 27 '16 at 07:18
  • Sounds like a separate question. And I'm not sure about it: it just doesn't make sense to me to worry about more threads from the `TaskScheduler`. – Artem Bilan Jan 27 '16 at 15:01