I am copying large (600 MB) files from an SFTP server, doing additional processing on some of them, and storing them on a network location. My question is: how do I do this efficiently?
This would be the most straightforward way:
// just copy
from("sftp://user@domain.nl:22/folder")
    .to("file:////mnt/networkfolder/copy");
// do some processing
from("sftp://user@domain.nl:22/folder?include=important.*")
    .bean(MyBean.class)
    .to("file:////mnt/networkfolder/processed");
However, this downloads the important files from the SFTP server twice. I could do this instead:
// just copy
from("sftp://user@domain.nl:22/folder")
    .to("file:////mnt/networkfolder/copy");
// do some processing
from("file:////mnt/networkfolder/copy?include=important.*")
    .bean(MyBean.class)
    .to("file:////mnt/networkfolder/processed");
But then I pull the files from the network location back to my server for processing (or at least I think that is what will happen).
So my hunch is this would be the best way:
// get files to my server
from("sftp://user@domain.nl:22/folder")
    .to("file:////workdirectory");
// do the processing, then move to network
from("file:////workdirectory?include=important.*")
    .bean(MyBean.class)
    .to("file:////mnt/networkfolder/processed");
// copy
from("file:////workdirectory?delete=true")
    .to("file:////mnt/networkfolder/copy");
Will the two routes that start in the work directory interfere with each other? Is this the best approach, and does it matter in terms of network traffic and speed?
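To make the interference concern concrete, here is a minimal plain-Java sketch (no Camel involved) of which files each of the two work-directory routes would select. It assumes that `include=important.*` is treated as a regular expression matched against the whole file name. The class name, method names, and file names are hypothetical and only serve the illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class RouteOverlap {
    // Assumption: include=important.* is a regex matched against the whole file name.
    private static final Pattern INCLUDE = Pattern.compile("important.*");

    // Files the processing route would select from the work directory.
    static List<String> pickedByProcessingRoute(List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> INCLUDE.matcher(name).matches())
                .collect(Collectors.toList());
    }

    // The copy route has no include filter, so it would select every file.
    static List<String> pickedByCopyRoute(List<String> fileNames) {
        return List.copyOf(fileNames);
    }

    // Files that both routes would try to consume.
    static List<String> pickedByBoth(List<String> fileNames) {
        List<String> copy = pickedByCopyRoute(fileNames);
        return pickedByProcessingRoute(fileNames).stream()
                .filter(copy::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical file names, for illustration only.
        List<String> files = List.of("important-report.csv", "readme.txt", "important.xml");
        System.out.println("both routes: " + pickedByBoth(files));
    }
}
```

Running this prints `both routes: [important-report.csv, important.xml]`. Under the stated assumption, every file matched by the processing route is also matched by the unfiltered copy route, so those are exactly the files where I would expect the two routes to compete.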