I need to change a file's encoding. The method I've used loads the whole file into memory:

string DestinationString = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(File.ReadAllText(FileName)));
File.WriteAllText(FileName, DestinationString, new System.Text.ASCIIEncoding());

This works for smaller files (for example, when I want to convert the file to ASCII), but it won't work for files larger than 2 GB. How can I change the encoding without loading the file's entire content into memory?

Buda Gavril

1 Answer

You can't do so by writing to the same file - but you can easily do it to a different file, just by reading a chunk of characters at a time in one encoding and writing each chunk in the target encoding.

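using System.IO;
using System.Text;

public void RewriteFile(string source, Encoding sourceEncoding,
                        string destination, Encoding destinationEncoding)
{
    // Decode the source with sourceEncoding and encode the output with destinationEncoding.
    using (var reader = new StreamReader(source, sourceEncoding))
    {
        using (var writer = new StreamWriter(destination, false, destinationEncoding))
        {
            // Copy one chunk of characters at a time so memory use stays constant.
            char[] buffer = new char[16384];
            int charsRead;
            while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, charsRead);
            }
        }
    }
}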

You could always end up with the original filename via renaming, of course.
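
As a rough sketch of that renaming step, the same chunked loop can write to a temporary file and then move it over the original. The class name EncodingRewriter, the method name ConvertInPlace, and the ".tmp" suffix are all hypothetical choices made for illustration, and the sketch assumes no other process has the file open.

```csharp
using System.IO;
using System.Text;

public static class EncodingRewriter
{
    // Hypothetical helper: streams `path` into a temporary file in the target
    // encoding, then moves the temporary file into the place of the original.
    public static void ConvertInPlace(string path, Encoding sourceEncoding,
                                      Encoding destinationEncoding)
    {
        // Assumption: a sibling ".tmp" file is free to use; staying in the same
        // directory keeps the final move on one volume.
        string tempPath = path + ".tmp";

        using (var reader = new StreamReader(path, sourceEncoding))
        using (var writer = new StreamWriter(tempPath, false, destinationEncoding))
        {
            char[] buffer = new char[16384];
            int charsRead;
            while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, charsRead);
            }
        }

        // Both streams are closed at this point, so the original can be replaced.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

The delete-then-move pair is not atomic, so a crash between the two calls would leave only the ".tmp" file behind. File.Replace may be a better fit where that matters, since it can also keep a backup of the original.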

Jon Skeet
  • Side note: clearly one can change the encoding *to* ASCII in place, since it is a fixed-width encoding and requires no more bytes than any other encoding (whether it is worth doing is a different story) – Alexei Levenkov Mar 11 '16 at 07:18
  • @AlexeiLevenkov: Potentially, but it would be tricky as the file could end up shrinking. I'd definitely encourage the approach I've suggested here over that. – Jon Skeet Mar 11 '16 at 07:20