I am copying large (600 MB) files from SFTP, doing additional processing on some of them, and storing them on a network location. My question is: how do I do this efficiently?

This would be the most straightforward way:

// just copy
from("sftp://user@domain.nl:22/folder")
  .to("file:////mnt/networkfolder/copy");

// do some processing
from("sftp://user@domain.nl:22/folder?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

However, this makes two trips to the SFTP server for every important file. I could do this instead:

// just copy
from("sftp://user@domain.nl:22/folder")
  .to("file:////mnt/networkfolder/copy");

// do some processing
from("file:////mnt/networkfolder/copy?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

But then I move the files from the network location back to the server for processing (or at least I think that is what will happen).

So my hunch is this would be the best way:

// get files to my server
from("sftp://user@domain.nl:22/folder")
  .to("file:////workdirectory");

// do the processing, then move to network
from("file:////workdirectory?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

// copy
from("file:////workdirectory?delete=true")
  .to("file:////mnt/networkfolder/copy");

Will the two routes starting in the workdirectory interfere with each other? Is this the best way and does it matter in terms of network traffic and speed?
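
To put rough numbers on the three variants, here is a back-of-the-envelope sketch. The figures are assumptions for illustration only: 10 files of 600 MB, 3 of which match important.*, and processed output that is the same size as its input. The class and method names are made up for this example.

```java
public class TransferEstimate {

    // Returns {MB read from SFTP, MB read from the network share, MB written to the network share}.
    // Assumes the processed output is as large as the input file.
    static long[] estimate(int variant, long allMb, long importantMb) {
        switch (variant) {
            case 1: // two SFTP consumers: important files are downloaded twice
                return new long[] {allMb + importantMb, 0, allMb + importantMb};
            case 2: // copy first, then read important files back from the share
                return new long[] {allMb, importantMb, allMb + importantMb};
            case 3: // download once to a local work directory, then fan out
                return new long[] {allMb, 0, allMb + importantMb};
            default:
                throw new IllegalArgumentException("unknown variant " + variant);
        }
    }

    public static void main(String[] args) {
        long allMb = 10 * 600;      // assumed: 10 files of 600 MB each
        long importantMb = 3 * 600; // assumed: 3 of them match important.*
        for (int variant = 1; variant <= 3; variant++) {
            long[] e = estimate(variant, allMb, importantMb);
            System.out.printf("variant %d: sftp read=%d MB, share read=%d MB, share write=%d MB%n",
                    variant, e[0], e[1], e[2]);
        }
    }
}
```

Under these assumptions the third variant has the least network traffic, at the cost of extra local disk reads and writes that are not counted here.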

Ivana
  • Use one consumer and pass the file to a SEDA queue. A SEDA queue can have multiple consumers, so you pick up each file once and send it to SEDA, which does the processing in parallel. – Namphibian Dec 21 '18 at 00:16
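
A minimal sketch of how that suggestion might look in Camel's Java DSL, not a tested setup. It assumes camel-core and the SFTP (camel-ftp) component are on the classpath. The class name SftpFanOutRoutes, the queue name seda:incoming, and the registry name myBean are placeholders invented for this example. It reads "multiple consumers" as the SEDA multipleConsumers option, where every consuming route receives its own copy of each message.

```java
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

// Hypothetical route class; the SFTP and file URIs are taken from the question.
public class SftpFanOutRoutes extends RouteBuilder {

    // Same URI everywhere so producer and consumers share one SEDA endpoint.
    private static final String QUEUE = "seda:incoming?multipleConsumers=true";

    @Override
    public void configure() {
        // Single SFTP consumer: each remote file is downloaded once.
        from("sftp://user@domain.nl:22/folder")
            .to(QUEUE);

        // Consumer 1: plain copy of every file to the network folder.
        from(QUEUE)
            .to("file:////mnt/networkfolder/copy");

        // Consumer 2: only files whose name matches important.* are processed.
        from(QUEUE)
            .filter(header(Exchange.FILE_NAME).regex("important.*"))
            // Assumes MyBean is registered in the Camel registry as "myBean".
            .to("bean:myBean")
            .to("file:////mnt/networkfolder/processed");
    }
}
```

One thing to check before relying on this: as far as I know, the SFTP consumer holds the file content in memory unless an option such as localWorkDirectory or streamDownload is set, and a SEDA queue is in-memory as well, so 600 MB files may need extra heap or a different hand-off.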