I have a very simple XML file and am calculating its size in bytes and its SHA1 hash. Why do the three approaches below give different results? (The size difference grows as the file gets larger.)
Simple.xml file
<note>
<to>Test</to>
<from>TestTest</from>
<heading>TestTestTest</heading>
<body>TestTestTestTestTest</body>
</note>
The three ways I get the size and SHA1 hash:
// Requires: System.IO, System.Security.Cryptography, System.Text, System.Xml, System.Xml.Linq
void Main()
{
    var algorithm = new SHA1Managed();
    foreach (var file in new[] { @"C:\temp\simple.xml" })
    {
        Console.WriteLine($"--------- For Input file {file} ---------");

        // Case 1: hash the raw bytes exactly as stored on disk.
        Console.WriteLine("Case 1 - Using ReadAllBytes");
        var bytes = File.ReadAllBytes(file);
        Console.WriteLine($"Size: {bytes.Length}");
        Console.WriteLine($"SHA1 Hash: {BitConverter.ToString(algorithm.ComputeHash(bytes)).Replace("-", string.Empty)}");

        // Case 2: parse into an XmlDocument, then hash the UTF-8 bytes of OuterXml.
        Console.WriteLine("\n\nCase 2 - Using XmlDocument.OuterXml");
        XmlDocument doc = new XmlDocument();
        doc.Load(file);
        bytes = Encoding.UTF8.GetBytes(doc.OuterXml);
        Console.WriteLine($"Size: {bytes.Length}");
        Console.WriteLine($"SHA1 Hash: {BitConverter.ToString(algorithm.ComputeHash(bytes)).Replace("-", string.Empty)}");

        // Case 3: parse into an XDocument, then hash the UTF-8 bytes of ToString().
        Console.WriteLine("\n\nCase 3 - XDocument.ToString()");
        var xdoc = XDocument.Load(file);
        bytes = Encoding.UTF8.GetBytes(xdoc.ToString());
        Console.WriteLine($"Size: {bytes.Length}");
        Console.WriteLine($"SHA1 Hash: {BitConverter.ToString(algorithm.ComputeHash(bytes)).Replace("-", string.Empty)}");
    }
}
Results:
Case 1 - Using ReadAllBytes
Size: 121
SHA1 Hash: 94AD4DCFD700EB139796F6B0EEB11658B57AD57A
Case 2 - Using XmlDocument.OuterXml
Size: 111
SHA1 Hash: EC2979C571F07B2FDC186C4229A2C6CD677BBF8A
Case 3 - XDocument.ToString()
Size: 129
SHA1 Hash: 7236C0AD4279D9FCB0E3DFBA11B833B129032354
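One diagnostic I could run to narrow this down is sketched below. It rests on an assumption that is not confirmed anywhere above: that the differences come from how each API handles whitespace when loading and serializing. The class name WhitespaceCheck is hypothetical; PreserveWhitespace, LoadOptions.PreserveWhitespace and SaveOptions.DisableFormatting are the standard options on XmlDocument and XDocument. Even with these options the counts may not match the raw file exactly, for example because of a byte order mark or line-ending normalization by the parser.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
class WhitespaceCheck
{
    static void Main()
    {
        var file = @"C:\temp\simple.xml";

        // Baseline: the bytes exactly as stored on disk.
        Console.WriteLine($"Raw bytes: {File.ReadAllBytes(file).Length}");

        // XmlDocument keeps whitespace-only nodes only when PreserveWhitespace
        // is set before Load; the default is false.
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.Load(file);
        Console.WriteLine($"XmlDocument, whitespace preserved: {Encoding.UTF8.GetBytes(doc.OuterXml).Length}");

        // XDocument: keep whitespace on load and skip re-indentation on output.
        var xdoc = XDocument.Load(file, LoadOptions.PreserveWhitespace);
        var text = xdoc.ToString(SaveOptions.DisableFormatting);
        Console.WriteLine($"XDocument, formatting disabled: {Encoding.UTF8.GetBytes(text).Length}");
    }
}
```

If the three sizes move closer together with these options, that would suggest whitespace handling accounts for most of the gap.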