I have around 500 JPEG images on a removable media device. My desktop app (.NET 4.5) is a single WinForms form that currently holds FileInfo objects for these images in a List, built using Directory.EnumerateFiles. No problem there, and it is very quick. The main intent is to upload all these files to an S3 bucket while also doing a bit of logging via a REST API call and reporting progress back to the user, both per file and when the whole file set has finished uploading.

How can I process this List of FileInfo objects as efficiently as possible while also updating a progress bar and letting the user move the form without it freezing? Doing things in a simple foreach loop is obviously slow. Processing each file involves uploading the image to an S3 bucket if certain metadata fields exist, calling a REST API to store a record in a SQL database, then updating the UI to notify the user of progress and flagging the file in a data grid as "done". I can write all that code fine, but I'm unsure how to work through this list of files concurrently without causing UI issues on the form.

My real question: I have heard many people mention Parallel.ForEach, TPL, Tasks, and async/await, and I'm struggling to understand which is the best option for my use case and how to update the UI/progress bar without problems.

Andy
    Possible duplicate of [C# .Net Freeze while iterating through large number of files](https://stackoverflow.com/questions/38083668/c-sharp-net-freeze-while-iterating-through-large-number-of-files) – Ňɏssa Pøngjǣrdenlarp May 17 '19 at 23:22

1 Answer

Because this is an I/O-bound workload and your libraries probably support async, the async/await pattern is the way to go. While an upload or HTTP call is in flight, the OS tracks it through I/O completion ports and no thread is blocked, so the thread pool can reuse that thread for other work. The pattern also manages continuations and the synchronization context, so code after an await that was started on the UI thread resumes there and can update the UI.

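Parallel.For/ForEach does not support the async/await pattern (an async lambda passed to it returns at its first await, so the loop does not wait for the real work) and would ultimately be inefficient for I/O, so the simplest way to go would be Task.Run and Task.WhenAll with async methods.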
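To make that concrete, here is a minimal sketch of the Task.WhenAll approach. It adds a SemaphoreSlim throttle so that 500 uploads are not all started at once. The names `BatchUploader`, `ProcessAllAsync`, `processOneAsync` and `maxConcurrency` are made up for illustration, and `processOneAsync` stands in for your own S3 upload plus REST call. Treat it as one possible shape, not a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs an async operation over every path,
// with at most maxConcurrency operations in flight at a time.
public static class BatchUploader
{
    public static async Task ProcessAllAsync(
        IEnumerable<string> paths,
        Func<string, Task> processOneAsync, // placeholder for "upload to S3 + call REST API"
        int maxConcurrency,
        IProgress<int> progress)            // may be null if no progress reporting is needed
    {
        int completed = 0;
        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = paths.Select(async path =>
            {
                // Wait asynchronously for a free slot; no thread is blocked here.
                await throttle.WaitAsync();
                try
                {
                    await processOneAsync(path);
                }
                finally
                {
                    throttle.Release();
                }

                // Report the number of files finished so far.
                int done = Interlocked.Increment(ref completed);
                if (progress != null) progress.Report(done);
            }).ToList();

            // Completes when every file is done; rethrows if any operation failed.
            await Task.WhenAll(tasks);
        }
    }
}
```

In the form, you could create `new Progress<int>(n => progressBar1.Value = n)` on the UI thread (with `progressBar1` being whatever your progress bar is called) and pass it in from an `async` button click handler. Progress<T> captures the synchronization context at construction, so the callback runs on the UI thread and the form stays responsive while the awaits are pending.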

However, I would also take a look at ActionBlock<T> in the Microsoft TPL Dataflow library. This will give you the best of both worlds: you can use the async/await pattern and also limit the maximum degree of parallelism.
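A rough sketch of the ActionBlock<T> variant might look like the following. It assumes the System.Threading.Tasks.Dataflow NuGet package is referenced; `DataflowUploader`, `ProcessAllAsync`, `processOneAsync` and `maxConcurrency` are illustrative names, and `processOneAsync` again stands in for your per-file work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // from the System.Threading.Tasks.Dataflow NuGet package

// Hypothetical helper: feeds every path through an ActionBlock
// that runs at most maxConcurrency async operations at once.
public static class DataflowUploader
{
    public static async Task ProcessAllAsync(
        IEnumerable<string> paths,
        Func<string, Task> processOneAsync, // placeholder for "upload to S3 + call REST API"
        int maxConcurrency)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxConcurrency
        };
        var block = new ActionBlock<string>(processOneAsync, options);

        // Queue every file; the block's input buffer is unbounded by default.
        foreach (var path in paths)
        {
            block.Post(path);
        }

        // Signal that no more items are coming, then wait for the queue to drain.
        block.Complete();
        await block.Completion;
    }
}
```

One thing to keep in mind with this shape: if `processOneAsync` throws, the block faults and stops processing the remaining items, so per-file errors are usually better caught and logged inside the delegate.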

Another option would be Reactive Extensions (Rx), which has all these goodies as well.

TheGeneral