4

I have a Java application that monitors a directory and processes any new file that appears in it. The application needs to run on both Linux and Windows. The problem is that on Linux, when a user manually copies a file into the directory, the application picks the file up before it has been completely copied. I have tried locking the file using various methods, but the application is able to acquire a lock even while the system is still copying the file. I have also considered checking the file with the lsof command, but that is Linux-specific, so I am trying to avoid it.


Can someone please suggest a way to prevent the incomplete file from being picked up by the application? Thanks in advance.

Prashant
  • One approach to this problem is to always copy two files into the target folder. First you copy the actual data file into the target folder; let's call that file "fileName.dat". Once this file copy has completed successfully, you then create a dummy file called "fileName.done" within the target folder. Your application detects files that end with ".done" and knows there is a "fileName.dat" that is ready to read. – Hector Dec 14 '16 at 06:19
  • This would make sense if we had control over who is copying the file into the directory. Unfortunately, we don't know who will be copying the file or how to guide them. – Prashant Dec 14 '16 at 06:24
  • Don't use it until its size has been stable for say five minutes, or whatever you deem appropriate. – user207421 Dec 14 '16 at 07:34
  • I believe you can read from a file while it is still being written to. Why not try that approach? – Hector Dec 14 '16 at 08:14
  • @Hector Because you won't know whether end of file is temporary or permanent. – user207421 Dec 14 '16 at 08:20

3 Answers

1

You can check the file size at some interval, such as every 2 seconds. If the size has changed, the file is still being copied; if not, you can go ahead :)
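A minimal sketch of that check, assuming a hypothetical helper class `FileSizeProbe` and a caller-chosen pause (neither is from the original answer). It only compares two size readings, so a copy that stalls for longer than the pause would still look finished:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, named here only for illustration.
class FileSizeProbe {

    /**
     * Reads the file size, sleeps for pauseMillis, reads it again,
     * and returns true if both readings are equal.
     */
    static boolean sizeUnchangedAfter(Path file, long pauseMillis)
            throws IOException, InterruptedException {
        long before = Files.size(file);
        Thread.sleep(pauseMillis);
        long after = Files.size(file);
        return before == after;
    }
}
```

A caller could use something like `FileSizeProbe.sizeUnchangedAfter(path, 2000)` for the 2-second interval mentioned above and only process the file when it returns `true`.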

Keval
1

Maybe the answer from this post is a solution. You can try jpoller.

Community
23ars
0

One way to accomplish this: when your program detects a new file in the directory, it can add that file to a list and periodically check the file size of each element in the list with the following method:

File#length

If the file size has not changed after a certain amount of time, then theoretically the file should be fully copied and flushed to the directory.
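The list-based idea could be sketched roughly as follows. The class `PendingFileTracker`, its method names, and the quiet-period parameter are assumptions made for illustration; only `File#length` comes from the answer. The current time is passed in by the caller so that the logic does not depend on any particular scheduler:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tracker for files whose copy may still be in progress.
class PendingFileTracker {

    // Last observed size and the time at which that size was first seen.
    private static final class Observation {
        long size;
        long unchangedSinceMillis;

        Observation(long size, long nowMillis) {
            this.size = size;
            this.unchangedSinceMillis = nowMillis;
        }
    }

    private final Map<File, Observation> pending = new HashMap<>();
    private final long quietPeriodMillis;

    PendingFileTracker(long quietPeriodMillis) {
        this.quietPeriodMillis = quietPeriodMillis;
    }

    /** Starts tracking a newly detected file (no effect if already tracked). */
    void add(File file, long nowMillis) {
        pending.putIfAbsent(file, new Observation(file.length(), nowMillis));
    }

    /**
     * Re-checks every tracked file. Files whose size has stayed the same for
     * at least the quiet period are returned and removed from tracking.
     */
    List<File> pollReady(long nowMillis) {
        List<File> ready = new ArrayList<>();
        Iterator<Map.Entry<File, Observation>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<File, Observation> entry = it.next();
            Observation seen = entry.getValue();
            long size = entry.getKey().length();
            if (size != seen.size) {
                // Still growing: remember the new size and restart the clock.
                seen.size = size;
                seen.unchangedSinceMillis = nowMillis;
            } else if (nowMillis - seen.unchangedSinceMillis >= quietPeriodMillis) {
                ready.add(entry.getKey());
                it.remove();
            }
        }
        return ready;
    }
}
```

In an application, `add` would be called when the directory monitor reports a new file, and `pollReady(System.currentTimeMillis())` would be called on a timer. This remains a heuristic: a stable size suggests, but does not prove, that the copy has finished.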

Jacob G.
  • Thanks for the response, I have also considered this, but it does not seem to be a very clean solution. Do you have any other suggestion? – Prashant Dec 14 '16 at 06:16
  • Another possible solution would be to have your program recursively search for the origin file and compare its file size with that of the copied file. You would only have to search for it once, and then you can cache its absolute path somewhere in your program. – Jacob G. Dec 14 '16 at 06:18