
For our project we are using Azure File Storage. Large files (up to 500 MB) can be uploaded to it and must be processed by Java microservices (based on Spring Boot) that use the Azure SDK for Java to poll the directory periodically for new files. Is there any way to determine when a file has been completely uploaded, other than obvious approaches such as monitoring its size?

apetrelli

1 Answer


Unfortunately, it is not directly possible to detect when a file upload has completed, and monitoring the size does not help either. This is because a file upload happens in two stages:

  1. First, an empty file of a certain size is created. This maps to the Create File REST API operation.
  2. Next, content is written to that file. This maps to the Put Range REST API operation, which is where the actual data is written.
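To illustrate why the size gives no signal, here is a minimal local sketch of the two stages. It is only an analogy built on `java.io.RandomAccessFile`, not the Azure SDK: the class `TwoStageUploadDemo` and its methods `createEmpty`, `putRange` and `simulate` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local analogy only: mimics "create at full size, then write ranges" on a temp file.
final class TwoStageUploadDemo {

    // Stage 1 (analogous to Create File): reserve the full size before any data exists.
    static void createEmpty(Path file, long size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(size);
        }
    }

    // Stage 2 (analogous to Put Range): write one chunk of data at an offset.
    static void putRange(Path file, long offset, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(offset);
            raf.write(data);
        }
    }

    // Returns the size observed after stage 1 and after the first chunk of stage 2.
    static long[] simulate(long size, byte[] firstChunk) {
        try {
            Path file = Files.createTempFile("upload-demo", ".bin");
            try {
                createEmpty(file, size);
                long afterCreate = Files.size(file);
                putRange(file, 0, firstChunk);
                long afterFirstRange = Files.size(file);
                return new long[] {afterCreate, afterFirstRange};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = simulate(1024, new byte[] {1, 2, 3});
        System.out.println("after create: " + sizes[0] + ", after first range: " + sizes[1]);
    }
}
```

As long as the chunk fits inside the reserved size, both observed sizes equal that reserved size, so a poller that watches only the size cannot tell a freshly created file from a finished one.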

Assuming data is written to the file in sequential order (i.e. from byte 0 to the file size), one possibility would be to keep checking the last "n" bytes of the file to see whether they are non-zero. Non-zero bytes there would indicate that data has been written at the end of the file. Again, this is not a foolproof solution, as there may be cases where the last "n" bytes are genuinely zero.
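The tail check could be sketched roughly as follows. This is a sketch under stated assumptions, not SDK code: `TailCheck`, `RangeReader`, `tailLooksWritten` and `inMemory` are hypothetical names, and the sketch treats any non-zero byte in the tail as evidence that the end of the file has been written. The `RangeReader` would have to be backed by a ranged read against the file share so that only the last "n" bytes are fetched; that wiring is left out here.

```java
import java.util.Arrays;

final class TailCheck {

    // Hypothetical abstraction over "read `length` bytes starting at `offset`".
    // Not part of any SDK; a real implementation would issue a ranged download.
    interface RangeReader {
        byte[] read(long offset, int length);
    }

    // Returns true if the last n bytes contain at least one non-zero byte.
    // Assumption: the uploader writes sequentially, so a written tail implies
    // the earlier ranges were written too.
    static boolean tailLooksWritten(RangeReader reader, long fileSize, int n) {
        if (fileSize <= 0 || n <= 0) {
            return false;
        }
        int length = (int) Math.min(n, fileSize);
        byte[] tail = reader.read(fileSize - length, length);
        for (byte b : tail) {
            if (b != 0) {
                return true;
            }
        }
        // All zero: either still uploading, or the file genuinely ends in zeros.
        return false;
    }

    // In-memory reader, for local experiments only (small arrays).
    static RangeReader inMemory(byte[] content) {
        return (offset, length) ->
                Arrays.copyOfRange(content, (int) offset, (int) offset + length);
    }
}
```

A file that legitimately ends in "n" zero bytes would never pass this check, which is the limitation mentioned above, so a poller using it would probably also want a timeout or a fallback.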

Gaurav Mantri
  • Thanks, this is what I supposed initially. I'll upvote your solution for now and accept it later this week. – apetrelli May 05 '20 at 10:15
  • You're welcome. No rush in accepting the answer :). Someone might come up with a better solution. – Gaurav Mantri May 05 '20 at 10:18
  • @apetrelli How can we know whether the file is not being written through the REST API at that point? Could we check some sort of lock status? One way would be to check whether size and lastModifiedAt have not changed for 3-5 seconds, though I'm not sure that is correct all the time. – Gautam Kumar Samal Jun 10 '21 at 17:40
  • I see what you meant by checking the last "n" bytes for zero. Is there any way to do that without streaming the whole content, something like ReadRange(start, end)? – Gautam Kumar Samal Jun 10 '21 at 19:12
  • @GautamKumarSamal the system is transparent to us. In the end, however, we noticed that the system transfers the file under a random name and renames it to its final name only after the whole file has been transferred, so we used this mechanism to tell when the file is complete. However, this behaviour is tied to the specific uploader and is not generic. – apetrelli Jun 15 '21 at 15:25
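
The quiet-period idea raised in the comments (size and last-modified time unchanged for a few seconds) could be sketched like this. It is only a heuristic, and `QuietPeriodCheck` and `looksStable` are hypothetical names. The sketch assumes the poller keeps the size and last-modified timestamp from its previous poll and that the timestamp advances while ranges are being written. A slow or stalled uploader can still produce a false positive.

```java
import java.time.Duration;
import java.time.Instant;

final class QuietPeriodCheck {

    // Returns true if size and last-modified time are identical across two polls
    // and the last modification is at least `quietPeriod` old.
    static boolean looksStable(long previousSize, Instant previousModified,
                               long currentSize, Instant currentModified,
                               Instant now, Duration quietPeriod) {
        boolean unchanged = previousSize == currentSize
                && previousModified.equals(currentModified);
        boolean quietLongEnough = !currentModified.plus(quietPeriod).isAfter(now);
        return unchanged && quietLongEnough;
    }
}
```

The quiet period is a tuning knob: a longer value reduces false positives from paused uploads at the cost of delaying processing.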