1

I have a button in WinForms which imports from a text file. Once it starts importing, it changes the text of the button to Processing... and instantiates a CancellationToken which is supposed to cancel the operation if the user requests. If the button is pressed again while it's still importing, it should activate the task cancellation.

There are many ways to do that and I can't decide which one to use. What is the cleanest way?

private CancellationTokenSource _cts;
private List<string> _list = new List<string>();

private async void Button_Click(object sender, EventArgs e)
{
    if (button.Text == "Processing...")
    {
        if (_cts != null)
        {
            _cts.Cancel();
            _cts.Dispose();
        }
    }
    else
    {
        using var dialog = new OpenFileDialog
        {
            Filter = "All Files (*.*)|*.*"
        };

        if (dialog.ShowDialog() == DialogResult.OK)
        {
            button.Text = "Processing...";
            
            _cts = new CancellationTokenSource();

            var fileStream = dialog.OpenFile();

            using var reader = new StreamReader(fileStream);

            try
            {
                int count = 0;
                while (!reader.EndOfStream)
                {
                    var line = await reader.ReadLineAsync().WithCancellation(_cts.Token).ConfigureAwait(false);

                    Trace.WriteLine(line);
                    _list.Add(line);

                    count++;
                }
            }
            catch (TaskCanceledException)
            {
            }

            //await Task.Run(async () =>
            //{
            //    int count = 0;
            //    while (!reader.EndOfStream)
            //    {
            //        var line = await reader.ReadLineAsync().ConfigureAwait(false);
            //        //var line = reader.ReadLine();

            //        Trace.WriteLine(line);

            //        count++;
            //    }
            //});

            //await Task.Factory.StartNew(() =>
            //{
            //    int count = 0;
            //    while (!reader.EndOfStream)
            //    {
            //        var line = reader.ReadLine();

            //        Trace.WriteLine(line);

            //        count++;
            //    }
            //});

            BeginInvoke(new Action(() => button.Text = "Process"));
        }
    }
}

public static class ThreadExtensionMethods
{
    public static Task<T> WithCancellation<T>(this Task<T> task, CancellationToken cancellationToken)
    {
        return task.IsCompleted // fast-path optimization
            ? task
            : task.ContinueWith(
                completedTask => completedTask.GetAwaiter().GetResult(),
                cancellationToken,
                TaskContinuationOptions.ExecuteSynchronously,
                TaskScheduler.Default);
    }
}
nop
  • 4,711
  • 6
  • 32
  • 93
  • 1
    What if you just used `var fileLines = await File.ReadAllLinesAsync(dialog.FilePath, _cts.Token);`? – Camilo Terevinto Aug 09 '20 at 09:53
  • How much time needed for reading which forces you to apply `CancellationToken`? – aepot Aug 09 '20 at 09:54
  • @CamiloTerevinto, it should be reading line by line because the files are huge. – nop Aug 09 '20 at 10:00
  • @aepot, the files are huge. I picked StreamReader and this particular way because it's the fastest in this case. https://cc.davelozinski.com/c-sharp/fastest-way-to-read-text-files – nop Aug 09 '20 at 10:00
  • 2
    @CamilioTerevinto few notes: 1) it's .NET Core API 2) it's [broken API](https://stackoverflow.com/q/63217657/12888024) – aepot Aug 09 '20 at 10:01
  • @aepot, it works fine in .NET Core 3.1 (doesn't block the UI). – nop Aug 09 '20 at 10:02
  • 2
    How huge each line? Is there `CancellationToken` needed while line is reading? – aepot Aug 09 '20 at 10:03
  • @aepot, It doesn't matter when the CancellationToken is triggered (before or after the ReadLine). It just needs to support cancellation operation. Each line is like 2 Guids. – nop Aug 09 '20 at 10:12
  • @aepot I didn't know that it's "broken", so thanks for that, but keep in mind that this code can perfectly run in .NET Core 3.x with WinForms enabled, so that point is unneeded. – Camilo Terevinto Aug 09 '20 at 10:23
  • Do you really need to load all lines in memory? Can't you process the file line by line? – Theodor Zoulias Aug 09 '20 at 10:46
  • I've added an answer but there's an open question (thx @TheodorZoulias) : how do you want to handle the loaded data? – aepot Aug 09 '20 at 10:58
  • @TheodorZoulias, I'm processing it line by line. Using ReadLineAsync, not ReadToEndAsync – nop Aug 09 '20 at 10:59
  • @aepot, the processed data will be added into a ListBox. – nop Aug 09 '20 at 11:00
  • Whole huge file into a `ListBox`? I'm not sure but it can hang the UI. – aepot Aug 09 '20 at 11:01
  • @aepot, it won't but an example with a `List` is just fine. I mean the most important is the List, the ListBox is not necessary. – nop Aug 09 '20 at 11:01
  • 2
    `List` is not Thread-safe. I'll change the answer now to `List` but be sure that you're not doing anything with it while the data is loading. – aepot Aug 09 '20 at 11:03
  • @aepot, keep it so. – nop Aug 09 '20 at 11:04
  • If you are processing the file line by line, then why are you storing the lines into the `private List _list` [field](https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/fields)? – Theodor Zoulias Aug 09 '20 at 20:32
  • @Theodor Zoulias, because I'm actually adding the results to a ListBox. – nop Aug 12 '20 at 16:39
  • You are adding in item in the `ListBox` for each line of the file? How many lines are in the file on average? – Theodor Zoulias Aug 12 '20 at 17:01

1 Answers1

2

As FileStream.ReadAsync is slow, i would not recommend it. It's underlyuing method for entire async File System API. Let's run it in the Task.

private CancellationTokenSource _cts;
private List<string> _list = new List<string>();

private async void button1_Click(object sender, EventArgs e)
{
    if (_cts != null)
    {
        _cts.Cancel();
    }
    else
    {
        using var dialog = new OpenFileDialog
        {
            Filter = "All Files (*.*)|*.*"
        };

        if (dialog.ShowDialog() == DialogResult.OK)
        {
            button.Text = "Processing...";
            using (_cts = new CancellationTokenSource())
            {
                await Task.Run(() =>
                {
                    using var reader = new StreamReader(dialog.OpenFile());
                    int count = 0;
                    while (!reader.EndOfStream && !_cts.Token.IsCancellationRequested)
                    {
                        var line = reader.ReadLine();

                        Trace.WriteLine(line);
                        _list.Add(line);

                        count++;
                    }
                });
            }
            _cts = null;
            button.Text = "Process";
        }
    }
}

Note that List is not Thread-safe. Be sure that you're not doing anything with it while the data is loading. If you do, consider some collection from System.Collections.Concurrent namespace.

Also you may avoid proxy collection _list.Add(line) here and process data inline or add it to the ListBox as you said in comments. Something like this: this.Invoke((Action)(() => listBox.Add(line))).

aepot
  • 4,558
  • 2
  • 12
  • 24
  • What is the reason for using a `ConcurrentBag` instead of a `List`? – Theodor Zoulias Aug 09 '20 at 10:43
  • @TheodorZoulias Because it's running on a pooled `Thread`. Guessing that OP want to read the collection while file is loading. If not, the `List` is applicable. – aepot Aug 09 '20 at 10:45
  • My own guess is that the file content is stored in the `_list` field after a bad design decision. Assuming that the form is the main form of the application, a field of this form is effectively a global variable. No need to store intermediate data of transient operations into global variables! – Theodor Zoulias Aug 09 '20 at 10:58
  • 1
    @TheodorZoulias agree. Asked for the details it in comments. – aepot Aug 09 '20 at 10:59
  • 1
    There's no need at all for `Interlocked.Increment(ref count);`, this code doesn't run in multiple threads at the same time. Same reason why using a Concurrent collection would be pointless. And there is no need to throw just to catch the exception and do nothing with it, just finish the method and avoid the performance hit. – Camilo Terevinto Aug 09 '20 at 12:06
  • 1
    @CamiloTerevinto why i thught tha `count` is some global variable? Don't know, fixed. Exception is not necessary here but it will be ttown one time on `Cancel()`. Thus there will be no performance impact at all. But that makes sense, fixed that too. Thank you! – aepot Aug 09 '20 at 12:33