Reliably asynchronously reading and writing binary Streams in c# (Always check method call return values)

Every so often, I get brutally reminded not to think I am too clever. Indeed, one thing I’ve learned as a developer is that there is always more to learn, and I should be humble, because tomorrow, I will learn something from a more experienced developer, something that makes me an idiot today.

For a change, the application that I use in almost all my articles has not yet been updated with the code presented here, because of other code changes still in progress. The application is currently not in a state that’s fit to release. RomyView.zip

Case in point, yesterday I read an article written several years ago by Jon Skeet, on reading binary data. The following is an excerpt from his original article:

In the C# newsgroup, I’ve seen quite a lot of code for reading in data from a file like this:

// Bad code! Do not use!
FileStream fs = File.OpenRead(filename);
byte
[] data =
new
byte
[fs.Length]; fs.Read (data, 0, data.Length);

This code is far from guaranteed to work. In particular, the FileStream could be reading just the first 10 bytes of the file into the buffer. The Read method is only guaranteed to block until some data is available (or the end of the stream is reached), not until all of the data is available. That’s where the return value (which is ignored in the above code) is vital. You need to cope with the case where you can’t read all of the data in one go, and loop round until you’ve read what you want.

Oh, nooooo. One of my core methods is little more than an asynchronous version of the bad code. This is an extension method that copies one stream to another asynchronously, doing essentially the same as Stream.CopyToAsync except that it allows setting the buffer size. (While that may sound unnecessary, it solved some issues for me – opening huge bitmap files with the built-in method can throw OutOfMemoryExceptions far too easily for my liking, whereas my method works well, due to using a smaller buffer.)

I use this method indirectly throughout my “pet” application, and everywhere else where I’ve used code from it, so this affects production code (because my methods such as FileAsync.ReadAllBytesAsync, FileAsync.WriteAllBytesAsync and so on) all use this method internally. Of course copying one stream to another is a very generic operation, in the true sense of the word, so this is called when the application saves any kind of image file, which I always do asynchronously, or opening any image file, which I sometimes do asynchronously, most notably for status info in the background, when the user isn’t doing anything and really doesn’t expect obscure failures or crashes.

Copying one stream to another asynchronously – Bad Code
  1. public static async Task CopyToStreamAsync(this Stream source, Stream destination, int bufferSize)
  2. {
  3.     if (source == null)
  4.         throw new ArgumentNullException(“source”);
  5.  
  6.     if (destination == null)
  7.         throw new ArgumentNullException(“destination”);
  8.  
  9.     if (bufferSize <= 0)
  10.         throw new ArgumentOutOfRangeException(“bufferSize”, “bufferSize must be greater than zero”);
  11.  
  12.     var bytesRead = 0;
  13.     var buffer = new byte[bufferSize];
  14.  
  15.     while ((bytesRead = await Task<int>.Factory.FromAsync(source.BeginRead, source.EndRead, buffer, 0, bufferSize, null)) > 0)
  16.     {
  17.         await Task.Factory.FromAsync(destination.BeginWrite, destination.EndWrite, buffer, 0, bytesRead, null);
  18.     }
  19. }

 

Granted, the bad code has never failed me. It has always managed to read the source stream and write the exact data to the destination stream. But it relies on implementation details – whatever BeginRead, EndRead, BeginWrite and EndWrite do, which is probably calling ReadAsync and WriteAsync, and maybe I can get away with not checking how many bytes were read because of the pattern used. But relying on implementation details and maybes is risky.

An improvement

Here is a better implementation, but it still has a major flaw. Can you see what the problem is?

Copying one stream to another asynchronously – improved
  1. public static async Task CopyToStreamAsync(this Stream source, Stream destination, int bufferSize)
  2. {
  3.     if (source == null)
  4.         throw new ArgumentNullException(“source”);
  5.  
  6.     if (destination == null)
  7.         throw new ArgumentNullException(“destination”);
  8.  
  9.     if (bufferSize <= 0)
  10.         throw new ArgumentOutOfRangeException(“bufferSize”, “bufferSize must be greater than zero”);
  11.  
  12.     var size = Convert.ToInt32(Math.Min(bufferSize, source.Length));
  13.     var buffer = new byte[size];
  14.     var remaining = source.Length;
  15.  
  16.     while (remaining > 0)
  17.     {
  18.         var read = await source.ReadAsync(buffer, 0, size);
  19.  
  20.         if (read <= 0)
  21.         {
  22.             throw new EndOfStreamException(string.Format(“End of stream reached, but {0} remained to be read.”, FileUtils.FormatBytes(remaining)));
  23.         }
  24.  
  25.         await destination.WriteAsync(buffer, 0, read);
  26.         remaining -= read;
  27.     }
  28. }

 

If you read Jon Skeet’s article that I quoted up front, it should be obvious… The code foolishly assumes that the source stream supports seeking. Not all streams do. In fact, the Length and Position properties, as well as the SetLength And Seek methods are abstract on the base Stream class, and by convention, streams that do not support seeking will throw a NotSupportedException if any of those are called.

Ironically the “bad code” did not have any such problem, because the Task<T>.Factory.FromAsync method call wraps up all those little details for us.

The final method

In a way, it was actually easier being blissfully ignorant of the niggly Stream details. The Task<T>.Factory.FromAsync calls (i.e. the bad code) will probably always work. But still, it’s better to understand what you’re doing, isn’t it?

Copying one stream to another asynchronously final – working for seeking and non-seeking streams
  1. public static async Task CopyToStreamAsync(this Stream source, Stream destination, int bufferSize)
  2.         {
  3.             if (source == null)
  4.                 throw new ArgumentNullException(“source”);
  5.  
  6.             if (destination == null)
  7.                 throw new ArgumentNullException(“destination”);
  8.  
  9.             if (bufferSize <= 0)
  10.                 throw new ArgumentOutOfRangeException(“bufferSize”, “bufferSize must be greater than zero”);
  11.  
  12.             /* The source stream may not support seeking; e.g. a stream
  13.              * returned by ZipArchiveEntry.Open() or a network stream. */
  14.             var size = bufferSize;
  15.             var canSeek = source.CanSeek;
  16.  
  17.             if (canSeek)
  18.             {
  19.                 try
  20.                 {
  21.                     size = Convert.ToInt32(Math.Min(bufferSize, source.Length));
  22.                 }
  23.                 catch (NotSupportedException) { canSeek = false; }
  24.             }
  25.  
  26.             var buffer = new byte[size];
  27.             var remaining = canSeek ? source.Length : 0;
  28.  
  29.             /* If the stream is seekable, seek through it until all bytes are read.
  30.              * If we read less than the expected number of bytes, it indicates an
  31.              * error, so throw the appropriate exception.
  32.              *
  33.              * If the stream is not seekable, loop until we read 0 bytes. (It’s not
  34.              * an error in this case.) */
  35.             while (!canSeek || remaining > 0)
  36.             {
  37.                 var read = await source.ReadAsync(buffer, 0, size);
  38.  
  39.                 if (read <= 0)
  40.                 {
  41.                     if (canSeek)
  42.                         throw new EndOfStreamException(
  43.                             string.Format(“End of stream reached, but {0} remained to be read.”,
  44.                             FileUtils.FormatBytes(remaining)));
  45.                     else
  46.                         break;
  47.                 }
  48.  
  49.                 await destination.WriteAsync(buffer, 0, read);
  50.                 remaining -= canSeek ? read : 0;
  51.             }
  52.         }

 

I was a little lazy when I wrote this method… I didn’t feel like writing either two very similar loops in the same method, or two very similar methods, so I combined the logic for both into a single while loop, thus saving myself some typing.

This implementation behaves just like the previous one, for a source stream that supports seeking. That is, it checks the return value of the Stream.ReadAsync method, and iterates until all the binary data has been copied, and in the event of less than the expected number of bytes being read, it throws an EndOfStreamException.

If the source stream does not support seeking (hence the stream length is unknown), the while loop essentially evaluates to while (true), and continues to iterate until no more data can be read.

Advertisements

About Jerome

I am a senior C# developer in Johannesburg, South Africa. I am also a recovering addict, who spent nearly eight years using methamphetamine. I write on my recovery blog about my lessons learned and sometimes give advice to others who have made similar mistakes, often from my viewpoint as an atheist, and I also write some C# programming articles on my programming blog.
This entry was posted in Programming and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s