We have an application that generates documents. To help us in this process we use OpenXml.

For unit test purposes we want to see if the generated document is the same as the document we expect it to be.

However, this method:

public byte[] Save()
{
    using (var memoryStream = new MemoryStream())
    {
        _wordprocessingDocument.Clone(memoryStream);
        return memoryStream.ToArray();
    }
}

returns a different byte array every time we call it.

I can't really see why.

I saved two of these different byte arrays to disk and changed the extension to .zip.

All of the files in the zip have the same content.
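The manual check above can also be done in code. The following is a minimal sketch, assuming that comparing entry names and entry bytes is enough for the test; `PackageComparer` and `HaveSameContent` are hypothetical names, and only `System.IO.Compression` is used, not the Open XML SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper: compares two packages entry by entry.
static class PackageComparer
{
    // Reads every ZIP entry into a dictionary keyed by the entry's full name.
    static Dictionary<string, byte[]> ReadEntries(byte[] package)
    {
        var entries = new Dictionary<string, byte[]>();
        using (var stream = new MemoryStream(package))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            foreach (var entry in archive.Entries)
            {
                using (var entryStream = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    entryStream.CopyTo(buffer);
                    entries[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return entries;
    }

    // True when both packages hold the same entry names with identical bytes.
    // ZIP-level metadata (for example entry timestamps) is ignored.
    public static bool HaveSameContent(byte[] first, byte[] second)
    {
        var a = ReadEntries(first);
        var b = ReadEntries(second);
        return a.Count == b.Count
            && a.All(pair => b.TryGetValue(pair.Key, out var other)
                             && pair.Value.SequenceEqual(other));
    }
}
```

With the `Save()` method from the question, a test could then assert `PackageComparer.HaveSameContent(expectedBytes, actualBytes)` instead of comparing the raw byte arrays.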

Any pointers as to why the byte arrays are not exactly the same would be most welcome!

UPDATE

It must be something time related. The following code shows "false" in a console app. If you remove the Thread.Sleep() it returns "true".

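    using System;
    using System.IO;
    using System.Linq;
    using System.Threading;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    class Program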
    {
        static void Main(string[] args)
        {
            using (var document = WordprocessingDocument.Create("c:\\Test.docx", WordprocessingDocumentType.Document))
            {
                document.AddMainDocumentPart();
                document.MainDocumentPart.Document = new Document();
                document.MainDocumentPart.Document.Body = new Body();
                document.MainDocumentPart.Document.Body.AppendChild(new Paragraph());

                var firstClone = Save(document);
                File.WriteAllBytes("D:\\Test1.docx", firstClone);

                Thread.Sleep(5000);

                var secondClone = Save(document);
                File.WriteAllBytes("D:\\Test2.docx", secondClone);

                Console.WriteLine(firstClone.SequenceEqual(secondClone));
                Console.ReadLine();
            }
        }

        private static byte[] Save(WordprocessingDocument wordprocessingDocument)
        {
            using (var memoryStream = new MemoryStream())
            {
                wordprocessingDocument.Clone(memoryStream);
                return memoryStream.ToArray();
            }
        }
    }

WORKAROUND

In my unit tests I'm currently comparing the FlatOpcString of the two documents, since the content is the same and that is what I want to test. The question remains, though: why does saving this document twice, with 5 seconds in between, produce two different byte arrays with exactly the same content?
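A rough sketch of that workaround, assuming the Open XML SDK's `ToFlatOpcString()` method on the package; the helper name `DocumentAssert.HaveSameFlatOpc` is made up for illustration and is not part of the SDK.

```csharp
using DocumentFormat.OpenXml.Packaging;

// Hypothetical test helper: compares two documents by their Flat OPC text
// instead of by the bytes of the saved ZIP package.
static class DocumentAssert
{
    public static bool HaveSameFlatOpc(
        WordprocessingDocument expected,
        WordprocessingDocument actual)
    {
        // ToFlatOpcString() serializes all parts and relationships into one XML string.
        return expected.ToFlatOpcString() == actual.ToFlatOpcString();
    }
}
```

This compares the parts and relationships of the package rather than the ZIP container, so it only tells you whether the document content matches.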

Nick V
  • Can you show any of the generation code? I just created a test that creates a file and clones it and the byte arrays generated are the same. – petelids Apr 17 '19 at 20:17
  • @petelids I've updated my question with some sample code! Thanks in advance for your help! :-) – Nick V Apr 19 '19 at 07:35

1 Answer

The difference between the streams is not time related. It comes from the relationship ID that is generated when the files are saved.

I ran your program twice and renamed the saved file before the second run.

Then I compared the files using the OpenXML Productivity Tool and the only difference it showed was a relationship ID in the _rels part of the document.

The _rels part is a relationship part of the OpenXML file. It is a central place that stores references to other parts of the document. Relationship parts are explained further in the free ebook OpenXML Explained.
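If you don't have the Productivity Tool at hand, the relationship IDs can also be read straight from the package. This is only a sketch, assuming the package-level relationship part sits at `_rels/.rels` as in a standard OPC package; `RelationshipReader` is a hypothetical name and the code uses plain `System.IO.Compression` and LINQ to XML rather than the SDK.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: lists the relationships stored in _rels/.rels.
static class RelationshipReader
{
    static readonly XNamespace Rel =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns (Id, Target) pairs; an empty list if the part is missing.
    public static List<KeyValuePair<string, string>> ReadPackageRelationships(byte[] package)
    {
        using (var stream = new MemoryStream(package))
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            var entry = archive.GetEntry("_rels/.rels");
            if (entry == null)
            {
                return new List<KeyValuePair<string, string>>();
            }

            using (var entryStream = entry.Open())
            {
                return XDocument.Load(entryStream)
                    .Descendants(Rel + "Relationship")
                    .Select(r => new KeyValuePair<string, string>(
                        (string)r.Attribute("Id"),
                        (string)r.Attribute("Target")))
                    .ToList();
            }
        }
    }
}
```

Running this on both saved byte arrays and printing the pairs side by side shows whether the `Id` values differ between the two files.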

Taterhead