
I am trying out ApprovalTests.NET to test some (mostly legacy) PDF generation code built on MigraDoc. The code under test renders the PDF to a MemoryStream, which is then scrubbed of various metadata properties (using code I adapted from PdfScrubber), converted to a byte array, and verified using Approvals.VerifyBinaryFile().

The tests pass on my machine and on that of a colleague (both running Windows 10), but fail on our TeamCity build agent (an Azure VM, running Windows Server 2012 R2, I think). Comparing the Received file (generated on the build server) with the Approved file (generated on my machine), the metadata portions of the files are identical, but the binary portions are totally different, with one file being about 1 KB shorter than the other.

What might be causing the discrepancy? Is it likely to be OS-related?

Edit

The issue appears to be fonts (thanks to PDFSharp Expert for the suggestion). On closer inspection there are two binary objects that differ, and these apparently define the header and body fonts: when I remove one and then the other, the headers and the body text respectively turn to dots.

So, is there a way to guarantee that all machines will produce the same output with regard to fonts? So far I've tried:

  • passing PdfFontEmbedding.None to the constructor of PdfDocumentRenderer (previously it was using PdfFontEmbedding.Always)
  • setting a private font like so:

        var fonts = new XPrivateFontCollection();
        var arial = File.ReadAllBytes("path/to/arial/copied/from/windows/server.ttf");
        fonts.AddFont(arial, "Arial");
        XPrivateFontCollection.SetGlobalFontCollection(fonts);
    

In both cases I'm getting the same output as before on my local machine.

Blisco
  • If the TeamCity build agent produces a file that looks visually like what you expect, then you could look at the metadata at the PDF level of the two documents and look for version differences. For example, the PDF Archive specification flag produces different content than newer PDF format versions. So look at the PDF-level metadata to see whether they target the exact same PDF version. – Sql Surfer Aug 07 '16 at 21:40
  • For "Arial" you have to add up to four TTF files to the XPrivateFontCollection. If you use bold and/or italics then adding just the regular file won't be enough. A small mismatch somewhere might stop PDFsharp from using the private font. With a "like so" code snippet we cannot spot such issues. – I liked the old Stack Overflow Aug 11 '16 at 08:31

1 Answer

Without seeing the PDF files in question I can only speculate.

The TTF files used for the PDF may vary with the OS and this can affect the file size.
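One way to check whether the font files really differ between machines is to fingerprint them on each box and compare the output. The following is a minimal sketch that uses only the .NET base class library; the class name FontFingerprint and its methods are hypothetical helpers, not part of PDFsharp or ApprovalTests.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: prints a size and hash for a font file so the
// values can be compared across machines.
public static class FontFingerprint
{
    // Returns the SHA-256 hash of the data as a lowercase hex string.
    public static string Sha256Hex(byte[] data)
    {
        using (var sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data))
                .Replace("-", "")
                .ToLowerInvariant();
        }
    }

    // Returns "file name, size in bytes, hash" for the file at the given path.
    public static string Describe(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return Path.GetFileName(path) + ", " + bytes.Length + " bytes, " + Sha256Hex(bytes);
    }
}
```

To use it, call FontFingerprint.Describe(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "arial.ttf")) on both the developer machine and the build agent. The file name arial.ttf is an assumption about how the regular Arial face is stored; the bold and italic faces live in separate files and would need the same check.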

Non-JPEG images are read using framework/OS code, so size differences might also come from images.

PDF files contain many objects. PDFsharp can create verbose PDF files that are somewhat human readable (this is the default for DEBUG mode). Run the tests with a DEBUG build and compare the PDF files to see which objects contribute to the size difference.
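The object-by-object comparison can also be automated. The sketch below is a rough illustration, not a real PDF parser: it assumes the files use classic "N G obj ... endobj" syntax without object streams, and it finds objects with a regular expression, so a binary stream that happens to contain "endobj" would confuse it. The class name PdfObjectDiff is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: lists the indirect objects that differ between two PDFs.
public static class PdfObjectDiff
{
    // Matches "12 0 obj ... endobj"; Singleline lets '.' span line breaks.
    private static readonly Regex ObjectPattern = new Regex(
        @"(?<!\d)(?<num>\d{1,9})\s+(?<gen>\d{1,5})\s+obj\b(?<body>.*?)\bendobj",
        RegexOptions.Singleline);

    // Maps "objectNumber generation" to (body length, SHA-256 of body).
    public static Dictionary<string, Tuple<int, string>> Index(byte[] pdf)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data survives the byte -> string -> byte round trip.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string text = latin1.GetString(pdf);
        var index = new Dictionary<string, Tuple<int, string>>();
        using (var sha = SHA256.Create())
        {
            foreach (Match m in ObjectPattern.Matches(text))
            {
                string key = m.Groups["num"].Value + " " + m.Groups["gen"].Value;
                byte[] body = latin1.GetBytes(m.Groups["body"].Value);
                string hash = BitConverter.ToString(sha.ComputeHash(body));
                index[key] = Tuple.Create(body.Length, hash);
            }
        }
        return index;
    }

    // Returns one report line per object that is missing or different.
    public static List<string> Diff(byte[] approved, byte[] received)
    {
        var a = Index(approved);
        var b = Index(received);
        var report = new List<string>();
        // Sort by object number so the report follows the file order.
        foreach (string key in a.Keys.Union(b.Keys).OrderBy(k => int.Parse(k.Split(' ')[0])))
        {
            Tuple<int, string> x, y;
            bool inApproved = a.TryGetValue(key, out x);
            bool inReceived = b.TryGetValue(key, out y);
            if (!inApproved)
                report.Add(key + " obj: only in received (" + y.Item1 + " bytes)");
            else if (!inReceived)
                report.Add(key + " obj: only in approved (" + x.Item1 + " bytes)");
            else if (x.Item2 != y.Item2)
                report.Add(key + " obj: approved " + x.Item1 + " bytes, received " + y.Item1 + " bytes");
        }
        return report;
    }
}
```

Calling PdfObjectDiff.Diff(File.ReadAllBytes(approvedPath), File.ReadAllBytes(receivedPath)) and printing each line shows which object numbers account for the size difference; those objects can then be looked up in the verbose DEBUG output.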