1

I have a scenario where I need to upload files from local disk to Azure Blob Storage one by one and delete each file from the disk once it has been uploaded. The thing is, once a file is uploaded, I don't want to wait until that file is deleted before uploading the next one.

I can see that there is no asynchronous file delete in .NET. So what is the best way to handle this scenario, and how can I implement it? Currently I'm using the following code, but it seems to be unstable.

private event EventHandler FileDeleteEvent;
public async Task SendBulkTelemetryMessageConsumer()
{
  try
  {
    this.FileDeleteEvent +=this.FileDeleteEventHandler;
    // Logic to upload a file to blob storage
    await this.Log.Debug($"Deleting the file {file}");
    this.FileDeleteEvent(file);
  }
  catch()
  {
    // Exception handling
  }
}
private void FileDeleteEventHandler(string filePath)
{
        if (!File.Exists(filePath))
        {
            this.Log.Debug($"The file {filePath} doesn't exist.");
        }
        else
        {
            while (this.IsFileLocked(filePath))
            {
                Thread.Sleep(1000);
            }

            this.Log.Debug($"Deleting the file from the path {filePath}");
            File.Delete(filePath);
        }
}

private bool IsFileLocked(string filePath)
{
        try
        {
            using (File.Open(filePath, FileMode.Open))
            {
                return true;
            }
        }
        catch (IOException e)
        {
            this.Log.Error("Exception occured while deleting the file, Exception is {e}", e);
        }

        return false;
}

Should I make the event handler async void or async Task?

Or is it more appropriate to use a fire-and-forget approach, where I don't need any events or event handlers?

Harshith R
  • Who calls `SendBulkTelemetryMessageConsumer`? Where does `file` come from? Can you provide a more complete example? – asaf92 Aug 17 '21 at 08:13
  • 2
    Also don't `Thread.Sleep`, use `Task.Delay` and await it – asaf92 Aug 17 '21 at 08:13
  • 1
    Maybe this can help; https://stackoverflow.com/questions/10606328/why-isnt-there-an-asynchronous-file-delete-in-net – Joachim Isaksson Aug 17 '21 at 08:16
  • @asaf92, the SendBulkTelemetryMessageConsumer method is called by another internal method, and the file path is read from a config file. I have left that logic out here since that code block (getting the files, validating them against our business logic, and uploading them) is quite large and irrelevant to the question I'm asking. – Harshith R Aug 17 '21 at 08:22
  • 1
    Since when does deleting a file take all that long? It's basically just a write to the allocation table. – TheGeneral Aug 17 '21 at 08:28
  • If you wanted to get fancy and don't mind wasting your time, you could use TPL dataflow and create a pipeline – TheGeneral Aug 17 '21 at 08:51

3 Answers

3

The best you can do is wrap the file deletion in a task and then either await it or fire and forget. Which one you choose is up to you, but be aware of the consequences, such as losing exceptions when a task is not awaited:

var deletionTask = Task.Run(() => File.Delete(path));
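
If you go the fire-and-forget route, one way to avoid silently losing exceptions is to catch them inside the task itself. The following is only a minimal sketch: `FileCleanup`, `DeleteInBackground`, and the `onError` callback are hypothetical names introduced for illustration, not part of .NET.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileCleanup
{
    // Hypothetical helper: runs File.Delete on a thread-pool thread and
    // reports I/O failures through the callback instead of dropping them.
    public static Task DeleteInBackground(string path, Action<Exception> onError)
    {
        return Task.Run(() =>
        {
            try
            {
                File.Delete(path);
            }
            catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
            {
                // The caller may never await this task, so report the failure here.
                onError(ex);
            }
        });
    }
}
```

Fire and forget then looks like `_ = FileCleanup.DeleteInBackground(path, ex => Console.Error.WriteLine(ex));`. The returned task can still be awaited later if you want to know when the deletion has finished.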
Michał Turczyn
  • 4
    If you wrap IO-bound work in Task.Run() then you’re just pulling a new thread out to run the code synchronously. It may have a similar signature because it’s returning a Task, but all you’re doing is blocking a different thread. – Rafi Henig Feb 13 '22 at 02:21
1

My solution would be to have two parallel processes:

Process 1 uploads the files, then adds them to a list for process 2.

Process 2 deletes the files from the disk.

Finding a proper thread-safe collection shared by both processes is a minor issue. I would say ConcurrentQueue<T> should do it, but you could go as complex/exotic as Channel<T> if you wanted to.
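
As a rough sketch of the two-process idea using Channel<T>: the class name `UploadThenDeletePipeline` and the `upload` delegate are assumptions made for illustration (the delegate stands in for whatever actually sends a file to blob storage). It also assumes .NET Core 3.0 or later, where `ChannelReader<T>.ReadAllAsync` and `await foreach` are available.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class UploadThenDeletePipeline
{
    // Paths of files that were uploaded and are waiting to be deleted.
    private readonly Channel<string> _uploaded = Channel.CreateUnbounded<string>();

    // Hypothetical upload delegate; replace with the real blob upload call.
    private readonly Func<string, Task> _upload;

    public UploadThenDeletePipeline(Func<string, Task> upload) => _upload = upload;

    public async Task RunAsync(IEnumerable<string> files)
    {
        // "Process 2": started first so it deletes while uploads continue.
        Task deleter = DeleteLoopAsync();
        try
        {
            // "Process 1": upload each file, then hand it over for deletion.
            foreach (string file in files)
            {
                await _upload(file);
                await _uploaded.Writer.WriteAsync(file);
            }
        }
        finally
        {
            // Signal that no more files are coming, then wait for the deleter
            // so that its exceptions are observed rather than lost.
            _uploaded.Writer.Complete();
            await deleter;
        }
    }

    private async Task DeleteLoopAsync()
    {
        await foreach (string file in _uploaded.Reader.ReadAllAsync())
        {
            try
            {
                File.Delete(file);
            }
            catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
            {
                Console.Error.WriteLine($"Could not delete {file}: {ex.Message}");
            }
        }
    }
}
```

A file is only written to the channel after its upload has completed, so the deleter never touches a file that is still being uploaded.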

Note that the actual time savings should be minimal. Unless you are securely erasing the files, "deleting" a file only involves dropping its file system table entry, a very low-impact operation that should not matter next to the heavy load of uploading files.

Something you must really take care with in all kinds of multitasking is not swallowing exceptions. Usually you need to write bad code to swallow exceptions, like a catch-all or catch (Exception). With multitasking you can do it by accident. I have two articles on proper exception handling and why swallowing exceptions is bad.

Christopher
0

The use of spin locks (loops with delay waiting for a resource to be released) indicates design mistakes. I also think that deleting a local file is a very fast operation compared to uploading. A generic upload class could look like this:

public class AzureUploader
{
    public static async Task<AzureUploader> Connect()
    {
        var uploader = new AzureUploader();
        // Initialize your client with your credentials. Return initialized object.
        return uploader;
    }

    public Task UploadAndDelete(string filename)
    {
        return UploadToAzure(filename)
            .ContinueWith(task =>
                {
                    if (task.Status == TaskStatus.RanToCompletion)
                    {
                        Console.WriteLine("Upload was successful, now we can delete the file.");
                        File.Delete(filename);
                    }
                    else if (task.Status == TaskStatus.Faulted)
                    {
                        Console.WriteLine("An error occurred while uploading to Azure.");
                        Console.WriteLine(task.Exception.GetBaseException().Message);
                    }
                });
    }

    private async Task UploadToAzure(string filename)
    {
        // Closes the stream at the end of the method so that the file can be deleted.
        using var stream = File.OpenRead(filename);
        // Upload file to azure
    }
}

Usage:

var uploader = await AzureUploader.Connect();
await uploader.UploadAndDelete("myfile.ext");
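
The same upload-then-delete logic can also be written with plain async/await instead of ContinueWith, which some may find easier to read. This is only a sketch: `UploadHelpers`, `UploadAndDeleteAsync`, and the `upload` delegate are hypothetical names, with the delegate standing in for the real Azure upload call.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class UploadHelpers
{
    // Returns true if the upload succeeded and the local file was deleted.
    public static async Task<bool> UploadAndDeleteAsync(string filename, Func<string, Task> upload)
    {
        try
        {
            await upload(filename);
        }
        catch (Exception ex)
        {
            // Upload failed: keep the local file so it can be retried.
            Console.WriteLine($"An error occurred while uploading: {ex.Message}");
            return false;
        }

        // Only reached after a successful upload.
        File.Delete(filename);
        return true;
    }
}
```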
Michael