An asynchronous implementation of File.WriteAllLines

This is an example that (optionally) uses an extension method I wrote about back in April 2013. See the original post here. That post was about the caveats involved when working with streams. This time, I’m going to assume you know all about that, and not start by describing what you shouldn’t do…

The easy way

First, here’s the short answer. This is all you need for an asynchronous implementation of File.WriteAllLines() without using my extension method.

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Example
{
    public static class Constants
    {
        public const int BufferSize = 0x2000;
    }

    public static class FileAsync
    {
        /// <summary>Asynchronously creates a new file, writes a collection
        /// of strings to the file, and then closes the file.</summary>
        /// <param name="path">The file to write to.</param>
        /// <param name="contents">The lines to write to the file.</param>
        /// <returns>A Task that represents completion of the method.</returns>
        public static async Task WriteAllLinesAsync(string path, IEnumerable<string> contents)
        {
            await WriteAllLinesAsync(path, contents, Encoding.UTF8);
        }

        /// <summary>Asynchronously creates a new file by using the specified encoding,
        /// writes a collection of strings to the file, and then closes the file.</summary>
        /// <param name="path">The file to write to.</param>
        /// <param name="contents">The lines to write to the file.</param>
        /// <param name="encoding">The character encoding to use.</param>
        /// <returns>A Task that represents completion of the method.</returns>
        public static async Task WriteAllLinesAsync(string path, IEnumerable<string> contents, Encoding encoding)
        {
            using (var memoryStream = new MemoryStream(contents.SelectMany(s => encoding.GetBytes(s.EndsWith("\r\n") ? s : s + "\r\n")).ToArray()))
            {
                using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, Constants.BufferSize, true))
                {
                    await memoryStream.CopyToAsync(stream, Constants.BufferSize);
                }
            }
        }
    }
}

As you can see, all that’s needed is to call the built-in Stream.CopyToAsync() method.

The only tricky part was figuring out how to convert an IEnumerable<string>, such as a string array, into a byte array that can be passed to a MemoryStream constructor. After that, it’s simply a matter of copying the memory stream to a new file stream asynchronously.
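To make the conversion step easier to see on its own, here is a minimal sketch that pulls the expression out into a helper. The names LineEncodingSketch and ToBytes are hypothetical and only exist for this illustration, and StringComparison.Ordinal is my addition so that the line-ending check is a plain character comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class LineEncodingSketch
{
    // Hypothetical helper: the same conversion that feeds the MemoryStream
    // constructor, isolated so it can be inspected by itself.
    public static byte[] ToBytes(IEnumerable<string> contents, Encoding encoding)
    {
        return contents
            // Append "\r\n" to every line that does not already end with it,
            // encode each line, and flatten the per-line arrays into one sequence.
            .SelectMany(s => encoding.GetBytes(
                s.EndsWith("\r\n", StringComparison.Ordinal) ? s : s + "\r\n"))
            // Materialise the sequence as a single byte array.
            .ToArray();
    }
}
```

For example, BitConverter.ToString(LineEncodingSketch.ToBytes(new[] { "a", "b" }, Encoding.UTF8)) gives 61-0D-0A-62-0D-0A. Note that the whole file content ends up in one byte array before anything is written, which is fine for modest text files but worth keeping in mind for very large inputs.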

The slightly harder way

(If you read the old post, you don’t need to read this. But if you’re lazy like me, here’s the extension method I wrote about last time.)

That code should be suitable most of the time. But in my own code, which often copied large streams (not text files but image streams, sometimes huge Photoshop files or bitmaps), the built-in CopyToAsync() method just wasn’t reliable for me: it would sometimes throw an OutOfMemoryException. (Or maybe I told myself so because writing a stream-to-stream copying method was more interesting.) So I implemented my own version, and I now always use it because it has proved to be reliable. Here’s how to write your own alternative to Stream.CopyToAsync.

This code is also useful for many async helper methods besides writing files. An async WriteAllLines implementation is just one example. I also use the technique to read streams asynchronously, to cache image contents, and to keep copies of image streams in the undo stack of my rudimentary image editor. Of course, you could go low-level and implement the Stream methods asynchronously yourself for even finer control, but I am far too lazy for that, and copying streams asynchronously with a memory stream as a helper object suits all my needs pretty well.
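As a rough illustration of the "memory stream as a helper object" idea, here is a minimal sketch of reading any stream into a byte array, for example to cache an image. StreamCacheSketch and ReadAllBytesAsync are hypothetical names, and the built-in CopyToAsync is used only as a stand-in so the sketch compiles by itself; a custom copy method with the same signature could be swapped in.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class StreamCacheSketch
{
    // Hypothetical helper: asynchronously drains a readable stream into a
    // byte array by copying it through a MemoryStream.
    public static async Task<byte[]> ReadAllBytesAsync(Stream source, int bufferSize = 0x2000)
    {
        using (var memoryStream = new MemoryStream())
        {
            // Stand-in copy call; any method with the same
            // (Stream destination, int bufferSize) signature would do here.
            await source.CopyToAsync(memoryStream, bufferSize);

            // ToArray returns a copy of the bytes written so far.
            return memoryStream.ToArray();
        }
    }
}
```

The returned array can then be wrapped in a new MemoryStream whenever a fresh, seekable copy of the content is needed.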

The first method below is just a helper that formats a size in bytes in a friendlier way, similar to how Windows displays file sizes. It is only used to build the message when the code needs to throw an exception.

using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamExtensions
{
    private static string FormatBytes(long bytes)
    {
        const long KiloByte = 1024L;
        const long MegaByte = KiloByte * KiloByte;
        const long GigaByte = MegaByte * KiloByte;
        const long TeraByte = GigaByte * KiloByte;
        const long PetaByte = TeraByte * KiloByte;
        const long ExaByte = PetaByte * KiloByte;

        var formattedBytes = string.Empty;

        if (bytes < KiloByte)
            formattedBytes = string.Format("{0:F2} bytes", bytes);
        else if (bytes >= KiloByte && bytes < MegaByte)
            formattedBytes = string.Format("{0:F2} KB", Math.Round((double)bytes / KiloByte, 2, MidpointRounding.AwayFromZero));
        else if (bytes >= MegaByte && bytes < GigaByte)
            formattedBytes = string.Format("{0:F2} MB", Math.Round((double)bytes / MegaByte, 2, MidpointRounding.AwayFromZero));
        else if (bytes >= GigaByte && bytes < TeraByte)
            formattedBytes = string.Format("{0:F2} GB", Math.Round((double)bytes / GigaByte, 2, MidpointRounding.AwayFromZero));
        else if (bytes >= TeraByte && bytes < PetaByte)
            formattedBytes = string.Format("{0:F2} TB", Math.Round((double)bytes / TeraByte, 2, MidpointRounding.AwayFromZero));
        else if (bytes >= PetaByte && bytes < ExaByte)
            formattedBytes = string.Format("{0:F2} PB", Math.Round((double)bytes / PetaByte, 2, MidpointRounding.AwayFromZero));
        else if (bytes >= ExaByte)
            formattedBytes = string.Format("{0:F2} EB", Math.Round((double)bytes / ExaByte, 2, MidpointRounding.AwayFromZero));

        return formattedBytes;
    }

    /// <summary>An implementation to copy asynchronously from one stream to another,
    /// similar to <see cref="System.IO.Stream.CopyToAsync(Stream)"/></summary>
    /// <remarks>This was written because the default implementation would sometimes throw an OutOfMemoryException.</remarks>
    public static async Task CopyToStreamAsync(this Stream source, Stream destination, int bufferSize)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        if (destination == null)
            throw new ArgumentNullException("destination");

        if (bufferSize <= 0)
            throw new ArgumentOutOfRangeException("bufferSize", "bufferSize must be greater than zero");

        /* The source stream may not support seeking; e.g. a stream
         * returned by ZipArchiveEntry.Open() or a network stream. */
        var size = bufferSize;
        var canSeek = source.CanSeek;

        if (canSeek)
        {
            try
            {
                size = (int)Math.Min(bufferSize, source.Length);
            }
            catch (NotSupportedException) { canSeek = false; }
        }

        var buffer = new byte[size];
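        // Count only the bytes from the current position onward, so that a source
        // not positioned at its start does not cause a false end-of-stream error.
        var remaining = canSeek ? source.Length - source.Position : 0;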

        /* If the stream is seekable, seek through it until all bytes are read.
         * If we read less than the expected number of bytes, it indicates an
         * error, so throw the appropriate exception.
         *
         * If the stream is not seekable, loop until we read 0 bytes. (It's not
         * an error in this case.) */
        while (!canSeek || remaining > 0)
        {
            var read = await source.ReadAsync(buffer, 0, size);

            if (read <= 0)
            {
                if (canSeek)
                    throw new EndOfStreamException(
                        string.Format("End of stream reached, but {0} remained to be read.",
                        FormatBytes(remaining)));
                else
                    break;
            }

            await destination.WriteAsync(buffer, 0, read);
            remaining -= canSeek ? read : 0;
        }
    }
}

The CopyToStreamAsync method is the one that does the work. All it really does is call Stream.ReadAsync and Stream.WriteAsync in a loop, with a bit of code to allow for streams that don’t support seeking. (This code comes from my original post on this subject; see that post for a better explanation of what it does and the issues it avoids.)

Using this in the WriteAllLinesAsync method is a one-word change: replace the call to CopyToAsync with CopyToStreamAsync, which takes the same arguments.
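In case the one-word change isn’t obvious, here is a minimal sketch of it. So that the sketch compiles by itself, it includes a condensed stand-in for CopyToStreamAsync (without the argument checks and the seekable-stream length check of the full method); the class names StreamExtensionsStandIn and FileAsyncSketch are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

public static class StreamExtensionsStandIn
{
    // Condensed stand-in for the full CopyToStreamAsync method: read into a
    // buffer and write out what was read until ReadAsync returns 0.
    public static async Task CopyToStreamAsync(this Stream source, Stream destination, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        int read;

        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            await destination.WriteAsync(buffer, 0, read);
    }
}

public static class FileAsyncSketch
{
    private const int BufferSize = 0x2000;

    public static async Task WriteAllLinesAsync(string path, IEnumerable<string> contents, Encoding encoding)
    {
        using (var memoryStream = new MemoryStream(contents.SelectMany(s => encoding.GetBytes(s.EndsWith("\r\n") ? s : s + "\r\n")).ToArray()))
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize, true))
        {
            // The one-word change: CopyToStreamAsync instead of CopyToAsync.
            await memoryStream.CopyToStreamAsync(stream, BufferSize);
        }
    }
}
```

When the full extension method is already in scope, leave the stand-in class out; otherwise the compiler will report an ambiguous call.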

About Jerome

I am a senior C# developer in Johannesburg, South Africa. I am also a recovering addict, who spent nearly eight years using methamphetamine. I write on my recovery blog about my lessons learned and sometimes give advice to others who have made similar mistakes, often from my viewpoint as an atheist, and I also write some C# programming articles on my programming blog.