I am using Parallel.ForEach in C# to download multiple files from a Google bucket to a local folder. I have added retry logic so that a download is retried if it fails partway through. How can I apply the retry logic per file (or per thread) inside the Parallel.ForEach loop?

Parallel.ForEach(listFiles, objectName =>            
{
    retryCount = 0;                        
    countOfFiles++;
    downloadSuccess = false;
    bucketFileName = Path.GetFileName(objectName.Name);
    guidFolderPath = tempFolderLocation + "\\" + bucketFileName;

    while (retryCount < retryCountInput && downloadSuccess == false)
    {
        try
        {
            FileStream fs = new FileStream(guidFolderPath, FileMode.Create, FileAccess.Write, FileShare.Write);
            using (fs)
            {                                               
                storage.DownloadObjectAsync(bucketName, objectName.Name, fs, option, cancellationToken, progress).Wait();
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("Exception occurred while downloading file: " + ex.ToString());
            Thread.Sleep(RetryInterval(retryCount, minBackoffTimeSpan, maxBackoffTimeSpan, deltaBackoffTimeSpan));
            retryCount++;

        }
    }
});
Theodor Zoulias

2 Answers

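I would switch from Parallel.ForEach to tasks with async/await. That way the wait between retries (Task.Delay instead of Thread.Sleep) and the download itself don't block a thread-pool thread. Parallel.ForEach is intended for CPU-bound work, not for I/O.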

Something like this (I can't compile or test it without the rest of your code):

int retryCountInput = 5;
var tasks = new List<Task>();

foreach (var objectName in listFiles)
{
    var task = Task.Run(async () =>
    {
        // make it local
        int retryCount = 0;
        string bucketFileName = Path.GetFileName(objectName.Name);
        string guidFolderPath = tempFolderLocation + "\\" + bucketFileName;

        while (retryCount < retryCountInput)
        {
            try
            {
                using (var fs = new FileStream(guidFolderPath, FileMode.Create, FileAccess.Write, FileShare.Write))
                    // Use await here, instead of `Wait()` so this threadpool thread
                    // can be used for other tasks.
                    await storage.DownloadObjectAsync(bucketName, objectName.Name, fs, option, cancellationToken, progress);

                break;
            }
            catch (Exception ex)
            {
                Console.WriteLine("Exception occurred while downloading file: " + ex.ToString());

                // Use Task.Delay here, so this thread is 'released'
                await Task.Delay(RetryInterval(retryCount, minBackoffTimeSpan, maxBackoffTimeSpan, deltaBackoffTimeSpan));
                retryCount++;
            }
        }
    });
    tasks.Add(task);
}
await Task.WhenAll(tasks);
Jeroen van Langen
  • Hi @Jeroen, I have modified my code and replaced Parallel.ForEach with a foreach loop to iterate through the files. But now not all files appear in the download path. The number of downloaded files changes from run to run, and the behavior seems random. Can I use Task.Run for I/O operations? – user3934763 Apr 23 '20 at 06:01
I have modified my code and replaced Parallel.ForEach with a foreach loop to iterate through the files. But now not all files appear in the download path, even though the logs show that every file was downloaded. The number of files in the download path changes from run to run, and the behavior seems random. Can I use Task.Run for I/O operations?

var tasks = new List<Task>();
foreach (var objectName in listFiles)
{
    var task = Task.Run(() =>
    {
        downloadSuccess = false;
        bucketFileName = Path.GetFileName(objectName.Name);
        guidFolderPath = tempFolderLocation + "\\" + bucketFileName;

        var maxRetryAttempts = 3;
        var pauseBetweenFailures = TimeSpan.FromSeconds(2);
        RetryHelper.RetryOnException(maxRetryAttempts, pauseBetweenFailures, async () =>
        {
            FileStream fs = new FileStream(guidFolderPath, FileMode.Create,
                FileAccess.Write, FileShare.Write);
            using (fs)
            {
                var progress = new Progress<IDownloadProgress>(
                    p =>
                    {
                        DownloadProgress(p, retryCount, objectName.Name);
                    });

                await client.DownloadObjectAsync(bucketName, objectName.Name,
                    fs, option, cancellationToken.Token, progress);
            }
        });
    });
    tasks.Add(task);
}
await Task.WhenAll(tasks);
Theodor Zoulias
  • Yes, you can certainly use `Task.Run` for I/O operations. The recommended pattern in this case is to pass an async delegate as argument, like it's done in Jeroen van Langen's [answer](https://stackoverflow.com/a/61055858/11178549). What is not clear in your code is the signature of the `RetryHelper.RetryOnException` method. If it returns a `Task`, then you are supposed to `await` this task, otherwise it will run in fire-and-forget fashion. – Theodor Zoulias Apr 23 '20 at 06:31
  • RetryHelper.RetryOnException doesn't return a task. It has a void return type. – user3934763 Apr 23 '20 at 12:43
  • What about its third parameter? Is it of type `Func<Task>`? If not, you have an [`async void`](https://learn.microsoft.com/en-us/archive/msdn-magazine/2013/march/async-await-best-practices-in-asynchronous-programming#avoid-async-void) there. – Theodor Zoulias Apr 23 '20 at 18:55
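
To illustrate the point made in the comments above, here is a minimal sketch of a retry helper that accepts a `Func<Task>` and returns a `Task`, so an async lambda passed to it can be awaited instead of becoming `async void`. The class name `AsyncRetry` and the method `RetryOnExceptionAsync` are hypothetical; this is not the asker's `RetryHelper`, whose source isn't shown, and the simulated failure stands in for a real download.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not the RetryHelper from the question.
public static class AsyncRetry
{
    // The operation is a Func<Task>, so an async lambda converts to a
    // Task-returning delegate and each attempt can be awaited.
    public static async Task RetryOnExceptionAsync(
        int maxAttempts, TimeSpan pause, Func<Task> operation)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return; // success, stop retrying
            }
            // The filter lets the exception from the final attempt propagate.
            catch (Exception ex) when (attempt < maxAttempts)
            {
                Console.WriteLine($"Attempt {attempt} failed: {ex.Message}");
                await Task.Delay(pause); // waits without blocking a thread
            }
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        int calls = 0; // local to this operation, not shared between tasks

        await AsyncRetry.RetryOnExceptionAsync(
            3, TimeSpan.FromMilliseconds(10), async () =>
            {
                calls++;
                await Task.Yield(); // stands in for the real async download
                if (calls < 3)
                    throw new InvalidOperationException("simulated failure");
            });

        Console.WriteLine($"Succeeded after {calls} calls");
    }
}
```

With a helper shaped like this, the delegate passed to `Task.Run` would need to return the helper's task, for example `Task.Run(() => AsyncRetry.RetryOnExceptionAsync(...))`, so that `Task.WhenAll` waits until the downloads have actually finished rather than only until they have started.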