I have two different Java applications running on separate Linux machines, and both use a common Windows share. One app triggers the other to generate a specific file (e.g. an image or a PDF) on that share, and then the first app tries to upload the generated file to S3. The problem is that I sometimes get this:
com.amazonaws.services.s3.model.AmazonS3Exception: The Content-MD5 you specified did not match what we received.
or this:
com.amazonaws.AmazonClientException: Data read has a different length than the expected: dataLength=247898; expectedLength=262062; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ResettableInputStream; markedSupported=true; marked=0; resetSinceLastMarked=false; markCount=1; resetCount=0
All the steps happen synchronously, one after another (I have also checked the logs, which show no concurrent activity). I am also not setting the MD5 hash or the content length myself; the AWS SDK handles both.
So my guess is that the generating application has written the file and returned, but the file is in fact still being written by the OS in the background, which is why the first app gets an incomplete file.
I would really appreciate suggestions on how to handle such situations. Is there a way to detect whether a file is still being modified by the OS?
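To illustrate the kind of check I have in mind, here is a minimal sketch that polls the file's size and last-modified time and treats the file as complete once both stop changing for a few consecutive polls. The class name `FileStabilityCheck`, the method `waitUntilStable`, and the timing parameters are all hypothetical, and I am not sure this is reliable on a network share, since the client may cache file metadata.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the AWS SDK or of either application.
class FileStabilityCheck {

    /**
     * Polls the file until its size and last-modified time are unchanged
     * for requiredStableChecks consecutive polls, or until the timeout expires.
     * Returns true if the file looked stable, false on timeout.
     */
    static boolean waitUntilStable(Path file, long pollMillis,
                                   int requiredStableChecks, long timeoutMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long lastSize = -1;
        long lastModified = -1;
        int stableChecks = 0;

        while (System.currentTimeMillis() < deadline) {
            if (Files.exists(file)) {
                long size = Files.size(file);
                long modified = Files.getLastModifiedTime(file).toMillis();
                if (size == lastSize && modified == lastModified) {
                    // Nothing changed since the previous poll.
                    stableChecks++;
                    if (stableChecks >= requiredStableChecks) {
                        return true;
                    }
                } else {
                    // The file changed (or this is the first poll): start over.
                    stableChecks = 0;
                    lastSize = size;
                    lastModified = modified;
                }
            }
            Thread.sleep(pollMillis);
        }
        return false;
    }
}
```

The first app would call something like `FileStabilityCheck.waitUntilStable(path, 500, 3, 30_000)` before starting the upload, and skip or retry if it returns false. This is only a heuristic: a writer that pauses for longer than the polling window would still be mistaken for a finished file.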