I am writing a program that records what a user does on one computer and repeats it on another computer. Basically, a sync program between computers.

I have successfully replicated create and delete events, and even partially modified files, but I have a small problem with files that move between folders.

Not every folder is under my watcher, and my program can monitor several folders, so I keep a combined log of all operations from the different watchers. A move between folders that are under my watcher shows up in my log as a delete event plus a create event. My current method is to delay all processing and transfer the log in batches.

So far it runs well, but I keep thinking that this method is not reliable for detecting moves. For example, if a file is moved, it generates a delete event plus a create event. What if an update comes due when only one of those two events has been captured? For example, the watcher registers the delete event, and then the log gets shipped before it registers the create event. In that case the other computer will delete the file, and on the next scheduled operation it will have to download the file again because of the late create event. This has not happened yet only because my computer is a test environment, but a real-life PC will have multiple files moved and created at once. A delete event is easy, but a late create event will cause a problem, especially with big files.

How many milliseconds is a safe threshold to wait for the second event? Or better yet, is there a good method to reliably detect these two events as a whole before the log is transmitted?

UPDATE: Let me make something clear. The log looks like this example:

create file A
delete file B
create file B
delete file C
-> log transmitted
create file C
create file D
modify file E
-> log transmitted

In this example, the server would sync twice. File B is not a problem, since my program detected B's fingerprint on both operations in the first log and moved file B accordingly, but file C will have to be retransmitted. My question is: is there a way to prevent C from being deleted while waiting for its create entry?
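One possible way to keep a delete such as C's out of a batch until its partner has had a chance to arrive is sketched below. It is only an illustration under stated assumptions, not the watcher described above: `MovePairer`, its method names, and the `graceMillis` parameter are all hypothetical, the fingerprint of a deleted file is assumed to be available from a cache kept by the program, and no particular grace period is known to be safe.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: holds DELETE entries back until a grace period expires. */
class MovePairer {
    /** A delete that has been seen but not yet shipped. */
    private static final class PendingDelete {
        final String path;
        final long seenAtMillis;
        PendingDelete(String path, long seenAtMillis) {
            this.path = path;
            this.seenAtMillis = seenAtMillis;
        }
    }

    private final long graceMillis; // assumption: chosen by the caller
    // fingerprint -> delete still waiting for a possible matching create
    private final Map<String, PendingDelete> pending = new LinkedHashMap<>();
    // entries that are safe to put in the next batch, in order
    private final List<String> ready = new ArrayList<>();

    MovePairer(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** Record a delete; the fingerprint is assumed to come from a cache. */
    void onDelete(String path, String fingerprint, long nowMillis) {
        PendingDelete older = pending.put(fingerprint, new PendingDelete(path, nowMillis));
        if (older != null) {
            // Two deleted files share a fingerprint: release the older one now.
            ready.add("DELETE " + older.path);
        }
    }

    /** Record a create; pair it with a held delete of the same fingerprint. */
    void onCreate(String path, String fingerprint) {
        PendingDelete match = pending.remove(fingerprint);
        // If a different file was deleted at this path, ship that delete first
        // so it cannot later remove the file that is arriving now.
        pending.values().removeIf(d -> {
            if (d.path.equals(path)) {
                ready.add("DELETE " + d.path);
                return true;
            }
            return false;
        });
        if (match != null) {
            ready.add("MOVE " + match.path + " -> " + path);
        } else {
            ready.add("CREATE " + path);
        }
    }

    /** Returns the entries that may be shipped now; young deletes stay behind. */
    List<String> drain(long nowMillis) {
        Iterator<PendingDelete> it = pending.values().iterator();
        while (it.hasNext()) {
            PendingDelete d = it.next();
            if (nowMillis - d.seenAtMillis >= graceMillis) {
                ready.add("DELETE " + d.path); // no create showed up in time
                it.remove();
            }
        }
        List<String> batch = new ArrayList<>(ready);
        ready.clear();
        return batch;
    }
}
```

With the example log above, calling `drain` at the first transmission would ship the create of A and the move of B, while the delete of C stays held; if the create of C arrives before the grace period runs out, the second batch carries a single move for C. A delete whose create arrives later than the grace period would still be shipped as a plain delete, so this narrows the window rather than closing it.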

UPDATE:

If I send the I/O operation log to the remote computer in parallel, this would happen:

create file A
-> log transmitted
delete file B
-> log transmitted
create file B
-> log transmitted
delete file C
-> log transmitted
create file C
-> log transmitted
create file D
-> log transmitted
modify file E
-> log transmitted

Then files B and C would already have been deleted by the time the watcher sends the create operation.

And please note that I am NOT asking how to code a watcher. I have written my watcher, and it is WORKING, albeit sometimes at a high cost for an unfortunate move operation. I am looking for a way to know whether a delete operation will or will not be followed by a create operation.
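A different angle, shown here only as a rough sketch and not as something the program above does, is to make a late create cheap on the receiving side: a delete entry parks the file in a holding directory instead of removing it, and a later create entry with the same fingerprint is served from there without a download. `SoftDeleteReceiver`, the holding directory, and all method names are hypothetical, and the fingerprint is assumed to be a string that is safe to use as a file name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical receiver-side sketch: a "delete" parks the file instead of removing it. */
class SoftDeleteReceiver {
    /** A parked file and the time it was parked. */
    private static final class Parked {
        final Path slot;
        final long parkedAtMillis;
        Parked(Path slot, long parkedAtMillis) {
            this.slot = slot;
            this.parkedAtMillis = parkedAtMillis;
        }
    }

    private final Path holdingDir;
    // fingerprint -> file waiting in the holding directory
    private final Map<String, Parked> parked = new HashMap<>();

    SoftDeleteReceiver(Path holdingDir) throws IOException {
        this.holdingDir = Files.createDirectories(holdingDir);
    }

    /** Apply a DELETE entry by moving the file into the holding directory. */
    void applyDelete(Path target, String fingerprint, long nowMillis) throws IOException {
        // Assumption: the fingerprint is usable as a file name.
        Path slot = holdingDir.resolve(fingerprint);
        Files.move(target, slot, StandardCopyOption.REPLACE_EXISTING);
        parked.put(fingerprint, new Parked(slot, nowMillis));
    }

    /**
     * Apply a CREATE entry. Returns true if the file was restored from the
     * holding directory, false if the caller still has to download it.
     */
    boolean applyCreate(Path target, String fingerprint) throws IOException {
        Parked hit = parked.remove(fingerprint);
        if (hit == null) {
            return false;
        }
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.move(hit.slot, target, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }

    /** Permanently remove files that were parked before the given cutoff. */
    void purgeOlderThan(long cutoffMillis) throws IOException {
        Iterator<Parked> it = parked.values().iterator();
        while (it.hasNext()) {
            Parked p = it.next();
            if (p.parkedAtMillis < cutoffMillis) {
                Files.deleteIfExists(p.slot);
                it.remove();
            }
        }
    }
}
```

The trade-off in this sketch is disk space on the receiver: parked files occupy storage until `purgeOlderThan` is called, in exchange for not retransmitting a large file whose create entry arrives one batch late.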

Thank you

Magician
  • There isn't a portable way to do this, but look at http://stackoverflow.com/questions/7092081/java-file-renaming-detection – teppic Nov 20 '16 at 20:20
  • Yes, I gathered that, thank you. Is there a way to detect these moves? I want to avoid deleting something that has only moved. I am currently delaying operations and waiting for some time before the log gets shipped, but that doesn't guarantee that the two log entries of a move are transferred as a pair. Small files pose no problem, but moving large files such as videos could waste bandwidth unnecessarily. – Magician Nov 21 '16 at 04:27
  • And we can't tell which entry is a single independent action and which one is part of a pair until the other operation is logged. – Magician Nov 21 '16 at 04:30
  • I haven't tried it, but the jpathwatch lib mentioned in the link provides rename events. You could just ship the event. – teppic Nov 21 '16 at 04:31
  • OK, let me try jpathwatch. – Magician Nov 21 '16 at 04:38
  • If it doesn't work you could send a hash with the event. Expensive on the server, but cheap on the client. – teppic Nov 21 '16 at 04:43
  • My program already keeps hashes on a per-chunk basis, much like a torrent, but the hashes are not maintained for create/delete operations. They are used primarily to transfer partial data of a chunk. The problem here is when a file is moved and an "unfinished" log is sent. I have an algorithm that detects a delete event when it finds a create event with the same fingerprint and reclassifies them as a move event instead of create/delete. The problem is when the other half of the pair is missing or not yet logged. – Magician Nov 21 '16 at 07:23
  • My question is: why let logging create that problem in the first place? Why not run log transmission in parallel with your program logic, or better yet, have an underlying program design that doesn't require logs to determine state? – Nick Bell Nov 23 '16 at 00:39
  • Because if I send the log in parallel, every single move entry would be orphaned. I have to wait until the second operation is in place before sending the log. For normal create and delete operations this would not pose a problem. The problem only occurs when there is a delete + create pair. Or perhaps you can explain your idea further? I can still change the logic if needed. – Magician Nov 23 '16 at 08:51
  • You can visit this link: https://docs.oracle.com/javase/tutorial/essential/io/notification.html – Shankar Shastri Nov 23 '16 at 09:08
  • Yes, I have. I wrote my code based on that document, and it works fine. I am NOT asking how to code. My code is finished and WORKING. As you can see in that document, a "move" generates TWO events: create and delete. Now, how can I tell which delete is only a delete, which create is only a create, and which delete has a create following it, before sending the log? – Magician Nov 23 '16 at 10:23

0 Answers