Efficiently moving and processing large files between sftp, server and network with Apache Camel

Question

I am copying large (600 MB) files from SFTP, doing additional processing on some of them, and storing them on a network location. My question is: how do I do this efficiently?

This would be the most straightforward way:

// just copy
from("sftp://user@domain.nl:22/folder")
  .to("file:////mnt/networkfolder/copy");

// do some processing
from("sftp://user@domain.nl:22/folder?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

However, this makes two trips to the SFTP server. I could do this instead:

// just copy
from("sftp://user@domain.nl:22/folder")
  .to("file:////mnt/networkfolder/copy");

// do some processing
from("file:////mnt/networkfolder/copy?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

But then I move the files from the network back to the server for processing (or at least I think that is what will happen).

So my hunch is this would be the best way:

// get files to my server
from("sftp://user@domain.nl:22/folder")
  .to("file:////workdirectory");

// do the processing, then move to network
from("file:////workdirectory?include=important.*")
  .bean(MyBean.class)
  .to("file:////mnt/networkfolder/processed");

// copy
from("file:////workdirectory?delete=true")
  .to("file:////mnt/networkfolder/copy");

Will the two routes starting in the workdirectory interfere with each other? Is this the best way and does it matter in terms of network traffic and speed?

Use one consumer and pass the files to a seda queue. The seda queue can have multiple consumers, so you pick up each file once, send it to seda, and the processing happens in parallel. - Namphibian, Dec 21 '18 at 00:16
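
To make the comment's suggestion more concrete: Camel's seda component is documented as an in-memory BlockingQueue whose consumers run on their own threads (the number is set with its concurrentConsumers option). The following is a plain-Java sketch of that hand-off pattern only, not Camel code and not a tested answer to the question. The class name SedaSketch, the stop marker, the queue size and the startsWith check standing in for MyBean are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class SedaSketch {
    // Hypothetical stop marker for the workers; not a Camel concept.
    private static final String STOP = "\u0000STOP";

    /** One producer hands names to a bounded queue; `workers` threads drain it. */
    static List<String> run(List<String> fileNames, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(16);
        ConcurrentLinkedQueue<String> processed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    String name;
                    while (!(name = queue.take()).equals(STOP)) {
                        // Stand-in for the real processing (MyBean in the question).
                        if (name.startsWith("important")) {
                            processed.add(name);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The single "consumer" side: each file is picked up once and queued.
        for (String name : fileNames) {
            queue.put(name);
        }
        // One stop marker per worker so every thread terminates.
        for (int i = 0; i < workers; i++) {
            queue.put(STOP);
        }

        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        List<String> result = new ArrayList<>(processed);
        Collections.sort(result); // worker order is nondeterministic, so sort
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [important-1.csv, important-2.csv]
        System.out.println(run(List.of("important-1.csv", "other.csv", "important-2.csv"), 2));
    }
}
```

The queue is bounded so that a fast producer blocks instead of piling up work in memory. With 600 MB files, what actually gets queued (file handle versus full content) matters a great deal, and that is something to check against the Camel documentation rather than infer from this sketch.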

0 Answers