I have a situation where we're trying to process around 200 files by picking them up from an SFTP 'in' folder, processing them, and then moving them to another 'out' folder once processing is complete.

However, WSO2 is moving the files directly to the 'out' folder without processing them. This happens even though it processes the files one at a time rather than all at once. We even tried adding a file process interval between files, but the issue remains. For our setup in prod we have deployed our CAR on 2 pods in Kubernetes (we also tried a single pod, to no avail).

Edit

Note: This issue happens when the files are placed on SMB, but NOT when they are in a local folder. Locally, this integration works as it's supposed to.

Here's how we're defining the proxy for moving the files:

<proxy name="file_read" startOnLoad="true" transports="http https" xmlns="http://ws.apache.org/ns/synapse">
    <target>
        <inSequence>
            <property expression="get-property('transport', 'FILE_NAME')" name="INPUT_FILE_NAME" scope="default" type="STRING"/>
            <log>
                <property expression="$ctx:INPUT_FILE_NAME" name="Input-filename"/>
            </log>
            <respond/>
        </inSequence>
        <outSequence/>
        <faultSequence/>
    </target>
    <parameter name="transport.vfs.Streaming">true</parameter>
    <parameter name="transport.PollInterval">60</parameter>
    <parameter name="transport.vfs.MaxRetryCount">1</parameter>
    <parameter name="transport.vfs.FileURI">sftp://folder/in?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
    <parameter name="transport.vfs.ContentType">text/plain</parameter>
    <parameter name="transport.vfs.FileProcessInterval">30000</parameter>
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.vfs.MoveAfterFailure">sftp:///folder/error?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*.csv</parameter>
    <parameter name="transport.vfs.MoveTimestampFormat">yyyy-MM-dd'T'HH:mm:ss_</parameter>
    <parameter name="transport.vfs.MoveAfterProcess">sftp://folder/out?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
</proxy>

The Input-filename log is printed for each file that is picked up from the 'in' folder. For the files that are skipped (moved directly to the 'out' folder), however, the log is not printed.

I've posted about this before. What might be happening?

halfer
ahinsa
  • What do you mean by "without being processed"? What do you do with the files after reading them? – ycr Jul 25 '23 at 16:57
  • By 'processed', I mean that my proxy reads the data in the files, and after reading each file we move it from the 'in' folder to the 'out' folder. We are actually using a file poll interval, but most of the files are being skipped and moved to history without their data being read. – ahinsa Jul 25 '23 at 17:10
  • (by history I mean the 'out' folder) – ahinsa Jul 25 '23 at 17:16
  • How can anybody answer your question if you just say "It's not working"? You need to add more details to the question. Try to reproduce the issue with a smaller number of files and then share a minimal reproducible code sample with the logs you see. – ycr Jul 25 '23 at 22:54
  • I understand, I've edited the original question with the issue reproduced for a smaller number of files. – ahinsa Jul 26 '23 at 08:42
  • If you don't see the logs, the file was probably never read by WSO2. Try enabling debug logs for the package `org.apache.synapse.transport.vfs` and see whether you can find any additional information. Another option is to use a File Inbound instead of a proxy to read the files. https://ei.docs.wso2.com/en/7.0.0/micro-integrator/use-cases/examples/inbound_endpoint_examples/file-inbound-endpoint/ – ycr Jul 26 '23 at 12:03
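
For reference, the File Inbound alternative mentioned in the last comment might look roughly like the sketch below. It simply carries over the parameter values from the proxy in the question. The names `file_read_inbound`, `file_read_seq` and `file_read_fault_seq` are hypothetical, and the `interval`, `sequential` and `coordination` values are assumptions that should be checked against the WSO2 documentation linked above.

```xml
<!-- Sketch only: the endpoint and sequence names below are hypothetical. -->
<inboundEndpoint xmlns="http://ws.apache.org/ns/synapse"
                 name="file_read_inbound"
                 protocol="file"
                 sequence="file_read_seq"
                 onError="file_read_fault_seq"
                 suspend="false">
    <parameters>
        <!-- Assumed values: poll every 60 s, one file at a time, one polling node in a cluster. -->
        <parameter name="interval">60000</parameter>
        <parameter name="sequential">true</parameter>
        <parameter name="coordination">true</parameter>
        <!-- Values below are copied unchanged from the proxy in the question. -->
        <parameter name="transport.vfs.FileURI">sftp://folder/in?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
        <parameter name="transport.vfs.ContentType">text/plain</parameter>
        <parameter name="transport.vfs.FileNamePattern">.*.csv</parameter>
        <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
        <parameter name="transport.vfs.MoveAfterProcess">sftp://folder/out?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
        <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
        <parameter name="transport.vfs.MoveAfterFailure">sftp:///folder/error?sftpPathFromRoot=true&amp;transport.vfs.AvoidPermissionCheck=true</parameter>
    </parameters>
</inboundEndpoint>
```

In this arrangement the logging logic from the proxy's inSequence would move into the sequence referenced by `sequence`. Whether this changes the skipping behaviour seen in the question is not established here.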

1 Answer

WSO2 moves the original files to the transport.vfs.MoveAfterProcess directory, not "processed" files in the sense of changed or modified ones. If you want to store the processed (changed or modified) files, you should set something like this in the outSequence:

<outSequence>
   <!-- Name the output file after the message ID, with the 'urn:uuid:' prefix stripped -->
   <property name="transport.vfs.ReplyFileName"
             expression="fn:concat(fn:substring-after(get-property('MessageID'), 'urn:uuid:'), '.xml')"
             scope="transport"/>
   <!-- One-way send: no response is expected from the file endpoint -->
   <property action="set" name="OUT_ONLY" value="true"/>
   <send>
      <endpoint>
         <address uri="vfs:file:///D:/integration_name/processed"/>
      </endpoint>
   </send>
</outSequence>
tmoasz
  • Hi, thanks for your answer, but we aren't actually trying to modify the original files in any way; by 'processing' I mean our proxy just reads the data in each file in the 'in' folder and then, after reading, moves it to the 'out' folder. But in our case most of the files are being skipped and moved to the 'out' folder without their data being read. – ahinsa Jul 25 '23 at 17:17
  • And what is the "definition" of read in your case? What is your proxy logic doing? Can you add more details? – tmoasz Jul 26 '23 at 06:57
  • OK, I've now edited the original question with the proxy logic; I hope it's clearer now. – ahinsa Jul 26 '23 at 08:45