In a test WPF project, I am trying to use TPL Dataflow to enumerate all subdirectories of a given parent directory and build a list of files with a particular extension, e.g. ".xlsx". I use two blocks: the first, dirToFilesBlock, and the last, fileActionBlock.
To get the recursive effect of walking all subdirectories, the first block is linked back to itself, with a link predicate that tests whether the output item is a directory. This is an approach I found in a book on asynchronous programming. The second link goes to fileActionBlock, with a link predicate that tests whether the file has the correct extension; fileActionBlock then adds the file to a list.
The problem is that after I kick things off with btnStart_Click, it never finishes. That is, execution never gets past the await in the event handler to show the “Completed” message. I understand that I probably need to call dirToFilesBlock.Complete(), but I don’t know where in the code this should go, or under what conditions. I can't call it right after the initial Post, because that would stop the back link from feeding subdirectories back into the block. I’ve tried using the InputCount and OutputCount properties but didn’t get very far. If possible, I would like to keep the structure of the dataflow as it stands, because the link back also lets me update the UI with each new directory to be explored, which gives the user some feedback on progress.
I’m very new to TPL Dataflow, and any help is gratefully received.
Here is the code from the code-behind file:
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
using System.Windows;

public partial class MainWindow : Window
{
    TransformManyBlock<string, string> dirToFilesBlock;
    ActionBlock<string> fileActionBlock;
    ObservableCollection<string> files;
    CancellationTokenSource cts;
    CancellationToken ct;

    public MainWindow()
    {
        InitializeComponent();
        files = new ObservableCollection<string>();
        lst.DataContext = files;
        cts = new CancellationTokenSource();
        ct = cts.Token;
    }

    private Task Start(string path)
    {
        var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
        dirToFilesBlock = new TransformManyBlock<string, string>((Func<string, IEnumerable<string>>)GetFileSystemItems, new ExecutionDataflowBlockOptions { CancellationToken = ct });
        fileActionBlock = new ActionBlock<string>((Action<string>)ProcessFile, new ExecutionDataflowBlockOptions { CancellationToken = ct, TaskScheduler = uiScheduler });

        // Order of LinkTo's important here!
        dirToFilesBlock.LinkTo(dirToFilesBlock, new DataflowLinkOptions { PropagateCompletion = true }, IsDirectory);
        dirToFilesBlock.LinkTo(fileActionBlock, new DataflowLinkOptions { PropagateCompletion = true }, IsRequiredDocType);

        // Kick off the recursion.
        dirToFilesBlock.Post(path);
        return Task.WhenAll(dirToFilesBlock.Completion, fileActionBlock.Completion);
    }

    private bool IsDirectory(string path)
    {
        return Directory.Exists(path);
    }

    private bool IsRequiredDocType(string fileName)
    {
        return System.IO.Path.GetExtension(fileName) == ".xlsx";
    }

    private IEnumerable<string> GetFilesInDirectory(string path)
    {
        // Check for cancellation with each new dir.
        ct.ThrowIfCancellationRequested();
        // Check in case of dir access problems.
        try
        {
            // Materialize inside the try so that errors raised while
            // enumerating are caught here rather than later in the block.
            return Directory.EnumerateFileSystemEntries(path).ToList();
        }
        catch (Exception)
        {
            return Enumerable.Empty<string>();
        }
    }

    private IEnumerable<string> GetFileSystemItems(string dir)
    {
        return GetFilesInDirectory(dir);
    }

    private void ProcessFile(string fileName)
    {
        ct.ThrowIfCancellationRequested();
        files.Add(fileName);
    }

    private async void btnStart_Click(object sender, RoutedEventArgs e)
    {
        try
        {
            await Start(@"C:\");
            // Never gets here!!!
            MessageBox.Show("Completed");
        }
        catch (OperationCanceledException)
        {
            MessageBox.Show("Cancelled");
        }
        catch (Exception)
        {
            MessageBox.Show("Unknown err");
        }
    }

    private void btnCancel_Click(object sender, RoutedEventArgs e)
    {
        cts.Cancel();
    }
}
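For what it's worth, here is a rough sketch of one direction that might answer the "where do I call Complete()" part while keeping the self-link. It is not taken from the book and I have not verified it against a large tree. The idea is to count directories that have been discovered but not yet expanded, and to call Complete() when that count drops to zero. The names RecursiveScanSketch, FindAsync and pending are hypothetical. It also adds a third link to DataflowBlock.NullTarget<string>(), on the assumption that an item no link accepts (a file that is neither a directory nor ".xlsx") would otherwise sit at the head of the output buffer and stall the block. It further assumes Directory.Exists gives the same answer when counting and when the link predicate runs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class RecursiveScanSketch
{
    // Hypothetical helper: collects every ".xlsx" file found under root.
    public static async Task<List<string>> FindAsync(string root, CancellationToken ct)
    {
        var found = new List<string>();
        var options = new ExecutionDataflowBlockOptions { CancellationToken = ct };
        var propagate = new DataflowLinkOptions { PropagateCompletion = true };

        // Directories discovered but not yet expanded; starts at 1 for root.
        int pending = 1;
        TransformManyBlock<string, string> dirBlock = null;

        dirBlock = new TransformManyBlock<string, string>(dir =>
        {
            List<string> entries;
            try { entries = Directory.EnumerateFileSystemEntries(dir).ToList(); }
            catch (Exception) { entries = new List<string>(); }

            // Each subdirectory will come back through the self-link, so add
            // them to the count and subtract one for the directory just expanded.
            int subDirs = entries.Count(Directory.Exists);
            if (Interlocked.Add(ref pending, subDirs - 1) == 0)
            {
                // Nothing left to expand: stop accepting input. The entries
                // returned below are still emitted before the block completes.
                dirBlock.Complete();
            }
            return entries;
        }, options);

        // Default MaxDegreeOfParallelism is 1, so Add is never called concurrently.
        var fileBlock = new ActionBlock<string>(f => found.Add(f), options);

        // Order matters: directories loop back, matching files go on,
        // everything else is discarded so it cannot block the output buffer.
        dirBlock.LinkTo(dirBlock, Directory.Exists);
        dirBlock.LinkTo(fileBlock, propagate, f => Path.GetExtension(f) == ".xlsx");
        dirBlock.LinkTo(DataflowBlock.NullTarget<string>());

        dirBlock.Post(root);
        await fileBlock.Completion;
        return found;
    }
}
```

In the WPF version, the same counter could presumably live as a field next to dirToFilesBlock and be updated inside GetFilesInDirectory, with fileActionBlock still running on the UI scheduler.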