Spot the bug! An example of how to use the TaskCompletionSource<TResult> in C#

I have a habit of making several unrelated changes to projects around the same time. That works out fine in a controlled environment, like at work where there is source control, but it’s not so cool at home where I don’t have any source control installed, and I develop my “pet program”, used in most my examples here, which I also use as my code prototyping platform – and then later migrate pieces of it into work code.

Again, the source, for this and a whole bunch of other stuff, is here: RomyView.zip

Sometimes, some seemingly harmless code changes can introduce a subtle bug. That’s what happened here. It took me three days to find the problem because I was looking in the wrong place completely. I’ve probably given the game away in this post’s title, but anyway, do you see the bug in this code?

This code can deadlock! Don’t use it.
  1. public static class Example
  2. {
  3.     internal static readonly IEnumerable<string> NoFiles = Enumerable.Empty<string>();
  4.  
  5.     public static async Task<IEnumerable<string>> FindFilesAsync(string path, string searchFilter, SearchOption option = SearchOption.TopDirectoryOnly)
  6.     {
  7.         var bag = new ConcurrentBag<string>();
  8.         var usingFilter = !string.IsNullOrEmpty(searchFilter);
  9.  
  10.         Func<string, Task<bool>> addFileToBag = f =>
  11.         {
  12.             var tcs = new TaskCompletionSource<bool>();
  13.             try
  14.             {
  15.                 if (!usingFilter || Wildcard.IsMatch(Path.GetFileName(f).ToLowerInvariant(), searchFilter.ToLowerInvariant()))
  16.                 {
  17.                     bag.Add(f);
  18.                     tcs.TrySetResult(true);
  19.                 }
  20.             }
  21.             catch (Exception ex) { tcs.TrySetException(ex); }
  22.  
  23.             return tcs.Task;
  24.         };
  25.  
  26.         try
  27.         {
  28.             var degreeOfParallelism = Environment.ProcessorCount * 2;
  29.  
  30.             try
  31.             {
  32.                 await Directory.GetFiles(path, “*”, option).ForEachAsync(degreeOfParallelism, f =>
  33.                 {
  34.                     return addFileToBag(f);
  35.                 });
  36.             }
  37.             catch (UnauthorizedAccessException) { }
  38.             catch (IOException) { }
  39.         }
  40.         catch (Exception ex) { ex.Log(); }
  41.  
  42.         return bag.Count > 0 ? bag.ToArray() : NoFiles;
  43.     }
  44. }

 

The above code performs a search for files, then asynchronously filters the results based on the search filter, using a ForEachAsync method that was posted on the Parallel Programming blog to populate a concurrent collection; then returns the collection’s results. (Notice it uses a ConcurrentBag<T>, so the results are unordered. In the bigger picture, there are also methods that return ordered results after performing a parallel sort, but that isn’t relevant to this post.)

The problem lies, as the title implies, in my use of the TaskCompletionSource<TResult> type. This is a very handy type, which allows us to implement some asynchronous code however we need, then “populate” and return a Task representing the asynchronous operation’s eventual completion.

But when using the TaskCompletionSource<TResult>, you need to make sure you set it’s Task that will be returned appropriately for all possible cases relevant to your particular code. I didn’t do that too well in the case above. Can you see what will happen if a search filter string is passed in, and there are any files that do not match?

The nested function is of type Func<string, Task<bool>>. That is, you pass in a filter string that may contain wildcards or may be empty if there is no search filter, and it returns a Task that indicates whether the file was added to the collection. When the calling code awaits the nested function, the compiler “lifts” the Task<bool> to a bool value, so the code can be treated like a synchronous function that had the signature Func<string, bool>. But I didn’t set the result for any unmatched files. Thus, if a search filter is passed in and any files don’t match it, this method will deadlock. That is, it will never return. If I hadn’t used a TaskCompletionSource<bool> (but instead marked the Func as async, such that I could just return a bool in the code), or if the function was synchronous, with a Func<string, bool> signature, the code would have resulted in a compiler error, bringing this to my attention.

In retrospect, this might seem pretty obvious, but I looked at the code for 3 days without finding the problem. The point is, if you use a TaskCompletionSource<TResult> type, make sure you set the return value for its underlying Task correctly. (Which might also involve handling Cancellation, which wasn’t necessary here.)

Here is the same code with the bug fixed:

Fixed version
  1. public static async Task<IEnumerable<string>> FindFilesAsync(string path, string searchFilter, SearchOption option = SearchOption.TopDirectoryOnly)
  2. {
  3.     var bag = new ConcurrentBag<string>();
  4.     var usingFilter = !string.IsNullOrEmpty(searchFilter);
  5.  
  6.     /* The Task returned contains true if a file was
  7.      * added to the collection; otherwise false. */
  8.     Func<string, Task<bool>> addFileToBag = f =>
  9.     {
  10.         var tcs = new TaskCompletionSource<bool>();
  11.         try
  12.         {
  13.             if (!usingFilter || Wildcard.IsMatch(Path.GetFileName(f).ToLowerInvariant(), searchFilter.ToLowerInvariant()))
  14.             {
  15.                 bag.Add(f);
  16.                 tcs.TrySetResult(true);
  17.             }
  18.             else // Don’t forget to set the result to false if a file was not added; or the calling await will deadlock.
  19.             {
  20.                 tcs.TrySetResult(false);
  21.             }
  22.         }
  23.         catch (Exception ex) { tcs.TrySetException(ex); }
  24.  
  25.         return tcs.Task;
  26.     };
  27.  
  28.     try
  29.     {
  30.         var degreeOfParallelism = Environment.ProcessorCount * 2;
  31.  
  32.         try
  33.         {
  34.             await Directory.GetFiles(path, “*”, option).ForEachAsync(degreeOfParallelism, f =>
  35.             {
  36.                 return addFileToBag(f);
  37.             });
  38.         }
  39.         catch (UnauthorizedAccessException) { }
  40.         catch (IOException) { }
  41.     }
  42.     catch (Exception ex) { ex.Log(); }
  43.  
  44.     return bag.Count > 0 ? bag.ToArray() : NoFiles;
  45. }
Advertisements

About Jerome

I am a senior C# developer in Johannesburg, South Africa. I am also a recovering addict, who spent nearly eight years using methamphetamine. I write on my recovery blog about my lessons learned and sometimes give advice to others who have made similar mistakes, often from my viewpoint as an atheist, and I also write some C# programming articles on my programming blog.
This entry was posted in Programming and tagged , , , . Bookmark the permalink.

One Response to Spot the bug! An example of how to use the TaskCompletionSource<TResult> in C#

  1. Carl Cubillas says:

    Great example, will definitely have to keep this in mind. Had some trouble finding the issue until Resharper helped me out 😉

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s