
I am using Camel for my file operations. My system runs in a clustered environment.

Let's say I have 4 instances:
Instance A 
Instance B 
Instance C 
Instance D

Folder Structure

Input Folder: C:/app/input

Output Folder: C:/app/output

All four instances point to the input folder location. As per my business requirement, 8 files will be placed in the input folder and the output will be one consolidated file. Here Camel is losing data when the instances write concurrently to the output file.

Route:

 from("file://C:/app/input")
     .setHeader(Exchange.FILE_NAME, simple("output.txt"))
     .to("file://C:/app/output?fileExist=Append");

Kindly help me resolve this issue. Is there anything like a write lock in Camel to avoid concurrent file writers? Thanks in advance.

Pyare
  • As per my understanding, the input folder should be in each of those instances individually, and the output folder is the same as well? – Naveen Raj May 31 '15 at 17:21
  • No, the input and output folders are mounted locations; all 4 instances share the same location. – Pyare May 31 '15 at 17:22
  • As I understand it, when we place 8 files in the location, each of the 4 instances will start processing. Thus, say, file 1 will be processed by instance 1, file 2 by instance 2, and so on. In that situation each instance processes its files individually and tries to update the same output file, and the first one to access it locks it. This causes the other instances to error out or simply not update the file. Am I right? – Naveen Raj May 31 '15 at 17:35
  • I am not sure whether Camel uses a lock when writing files; I am a newbie to Camel. But every time I get a different count of records. Say each file has 1000 records; then the final output file should contain 8000 records, but I get a different count on each run. – Pyare May 31 '15 at 17:45
  • In this case it's good to use the aggregator2 component, as you are reading several files from a directory and creating one output file. Use the aggregate DSL and a completionPredicate for this. Refer: http://stackoverflow.com/questions/28339303/how-camel-2-11-batch-aggregation-works-with-separate-route – Ashoka Jun 01 '15 at 10:22
  • The problem seems to be a two-phase-commit issue. It appears that when Camel writes to the output file, it reads it into memory, appends to it, and then writes it back to disk. At the same time another consumer has done the same and does not know about the first route, hence your loss of data. – Ramin Arabbagheri Jun 02 '15 at 10:01
  • Yes, correct, Ramin Arabbagheri. How can we resolve this? Because one instance is not aware of another instance writing the file, right? – Pyare Jun 02 '15 at 16:22
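The read-modify-write race described in the comments can also be avoided outside Camel by taking an OS-level file lock before each append. A minimal sketch using the JDK's `java.nio` `FileLock` (the class name and paths here are illustrative, and note that advisory locks may not be honored reliably on all network mounts):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LockedAppend {

    // Appends one line to the target file while holding an OS-level
    // lock on it, so a writer in another process that also takes the
    // lock cannot interleave its append with ours or overwrite it.
    public static void appendWithLock(Path target, String line) throws IOException {
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) {
            channel.write(StandardCharsets.UTF_8.encode(line + System.lineSeparator()));
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("consolidated", ".txt");
        for (int i = 0; i < 3; i++) {
            appendWithLock(out, "record " + i);
        }
        // Every append survived; the file holds exactly 3 lines.
        System.out.println(Files.readAllLines(out).size());
    }
}
```

Each instance would append under the lock, so no append is lost; the trade-off is that writers serialize on the output file.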

1 Answer


You can use the doneFileName option of the file component; see http://camel.apache.org/file2.html for more information.

Avoid reading files currently being written by another application

Beware that the JDK File IO API is a bit limited in detecting whether another application is currently writing/copying a file, and the implementation can differ depending on the OS platform as well. This could lead to Camel thinking the file is not locked by another process and starting to consume it. Therefore you have to do your own investigation of what suits your environment. To help with this, Camel provides different readLock options and a doneFileName option that you can use. See also the section Consuming files from folders where others drop files directly.
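Concretely, the endpoints from the question could be adjusted along these lines. The option names (readLock, readLockCheckInterval, doneFileName) come from the Camel file2 page linked above; the interval value and done-file naming pattern are just examples, so verify them against your Camel version:

```
from: file://C:/app/input?readLock=changed&readLockCheckInterval=1000
to:   file://C:/app/output?fileExist=Append&doneFileName=${file:name}.done
```

The readLock option only guards the consuming side, so the concurrent-append race on the producer side still needs coordination (for example a done file that downstream consumers wait for, or routing all writes through a single instance).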

soilworker