I am having trouble copying from one memory stream to another. I am using a NuGet package to convert PDFs to PNGs, and I need to save the images as base64 strings. When I read the PDF in, the PDF object is created with the expected number of pages. After I save the PDF to a memory stream, that stream has a length (presumably correct; I am writing test validation for that now). But when I pass the stream to the method that copies it into a second stream for conversion, the second stream never has any data. I tried the two approaches below, one more involved and one short and sweet, both based on threads I've found on here.

I cannot get my memory streams to write to one another.

This is my class:

class pdf
{
    string localPath = @"C:\_Temp\MyForm.pdf";

    public pdf()
    {
        var base64String = GenerateSampleFormBase64(localPath);

        using(StreamWriter sw = new StreamWriter(@"C:\_Temp\log.txt"))
        {
            sw.WriteLine(base64String);
            sw.Flush();
        }

    }

    private static string GenerateSampleFormBase64(string path)
    {
        PdfDocument pdf = new PdfDocument(path);

        MemoryStream msPdf = new MemoryStream();
        pdf.Save(msPdf);

        var x = ConvertPdfPageToPng(msPdf);
        return Convert.ToBase64String(x);

    }

    static byte[] ConvertPdfPageToPng(MemoryStream msPng64)
    {
        // msPng64 length is 473923

        string base64;
        using(MemoryStream msPng = new MemoryStream(100))
        {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while((bytesRead = msPng64.Read(buffer, 0, buffer.Length)) > 0)
            {
                msPng.Write(buffer, 0, bytesRead);
            }

            // msPng is always length 0
            base64 = Convert.ToBase64String(msPng.GetBuffer(), 0, (int)msPng.Length);
            byte[] raw = Convert.FromBase64String(base64);

            if(raw.Length > 0)
                return raw;
            else
                throw new Exception("Failed to write to memory stream.");
        }
    }

    // This also did not work
    public static string ConvertToBase64(MemoryStream stream)
    {
        byte[] bytes;
        using(var memoryStream = new MemoryStream(100))
        {
            stream.CopyTo(memoryStream);
            bytes = memoryStream.ToArray();
        }

        return Convert.ToBase64String(bytes);
    }
}
Goodies
Nyra
  • You never reset the position in the MemoryStream back to the beginning. It never "knows" when you've stopped working at its end and you want to start working from its start. – Damien_The_Unbeliever Feb 11 '21 at 15:26
  • msPdf.Position = 0; after the Save() call. – Hans Passant Feb 11 '21 at 15:26
  • thank you both so much!!! I've been scratching my head for hours x_x – Nyra Feb 11 '21 at 15:28
  • also, why not just use [.CopyTo()](https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream.copyto?view=net-5.0)? – JonasH Feb 11 '21 at 15:42
  • NOTE: the constructor call new PdfDocument(path); above creates a new, empty PDF file, and saving that file causes the file at localPath to be lost (0 bytes). Use this instead: var pdf = PdfReader.Open(localPath); – Goodies Feb 11 '21 at 16:23
  • @JonasH I had also tried CopyTo (the remnants of that method are toward the bottom). I couldn't get either approach to work. – Nyra Feb 11 '21 at 16:46
  • @Goodies this is for local testing only. In the real implementation the PDF arrives as a base64-encoded string in a POST body, and I create the PDF from a MemoryStream passed to the constructor instead of a file path. Would changing it to var pdf still be better in that case, or is it OK to leave it as PdfDocument pdf - ..? – Nyra Feb 11 '21 at 16:48
  • @Nyra I played with the code above on Thursday and it just occurred to me that it wiped out my PDF, that's all. The code in your question is risky to run. Please take my advice and use PdfReader.Open() instead, or provide some sample content before saving the MemoryStream over your PDF. In the code above the MemoryStream will always be empty, because it has just been created, so nothing is actually tested. – Goodies Feb 13 '21 at 21:18
  • @Goodies, I'm still not 100% sure what you are saying. After looking around, I think you are referring to PdfReader, which seems to be PDFSharp, correct? The NuGet package I am using is Docotic, and it does not have an Open method; its constructor accepts a string for a local path (which I was using for testing only), but it also accepts a stream. I am already reading the data in via a stream, so my constructor call takes a stream rather than a file path: PdfDocument pdf = new PdfDocument(Stream stream). The syntax is very similar to PDFSharp's, but the methods are not the same. – Nyra Feb 17 '21 at 20:35
  • @Nyra this explains it. I did indeed test with PDFSharp, and in that case the code wipes the PDF. I have edited your question to include the docotics.pdf tag. This platform is quite exotic: according to the SO tag list, there are only 4 questions with that tag. Please make sure you add tags in future questions. – Goodies Feb 18 '21 at 13:12
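
The fix described in the comments can be sketched without any PDF library. This is a minimal illustration using only System.IO: the class name StreamPositionDemo and the three sample bytes are hypothetical stand-ins for the question's class and for the output of pdf.Save(msPdf).

```csharp
using System;
using System.IO;

static class StreamPositionDemo
{
    // Copies the whole source stream into a new buffer and returns it as Base64.
    public static string ToBase64(MemoryStream source)
    {
        // Writing leaves Position at the end of the data; rewind so that
        // CopyTo starts reading from the first byte.
        source.Position = 0;

        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);
            return Convert.ToBase64String(copy.ToArray());
        }
    }

    static void Main()
    {
        var ms = new MemoryStream();
        ms.Write(new byte[] { 1, 2, 3 }, 0, 3);   // stands in for pdf.Save(ms)

        // Without a rewind, CopyTo reads from the current position (the end),
        // so the destination stays empty.
        using (var empty = new MemoryStream())
        {
            ms.CopyTo(empty);
            Console.WriteLine(empty.Length);      // 0
        }

        Console.WriteLine(ToBase64(ms));          // AQID
    }
}
```

Both Read and CopyTo start at the stream's current Position, which is why both approaches in the question produced an empty second stream. If the goal is only to get the bytes, calling ToArray() on the original MemoryStream returns the full contents regardless of Position, so the second stream is not needed at all.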

1 Answer

You are calling PdfDocument.Save(), but that method saves the document as a PDF, not as a PNG. Use PdfPage.Save instead. This sample code shows how to generate a Base64-encoded PNG image of the first page of a PDF document:

using (var pdf = new PdfDocument(path))
{
    PdfDrawOptions options = PdfDrawOptions.Create();
    options.BackgroundColor = new PdfRgbColor(255, 255, 255);
    options.Compression = ImageCompressionOptions.CreatePng();

    string base64 = ConvertPdfPageToBase64Png(pdf.Pages[0], options);
    File.WriteAllText("page0_base64.txt", base64);
}

private static string ConvertPdfPageToBase64Png(PdfPage page, PdfDrawOptions options)
{
    using (var stream = new MemoryStream())
    {
        page.Save(stream, options);
        return Convert.ToBase64String(stream.GetBuffer(), 0, (int)stream.Length);
    }
}
Vitaliy Shibaev
  • I think you are following the same conversation as in the comment thread above. This is using the NuGet package Docotic, not PDFSharp. The issue was that I had to set Position = 0. – Nyra Feb 18 '21 at 21:15
  • I am the developer of Docotic.Pdf, and I wrote about exactly this package :-) The code in your question saves the PDF to a memory stream (as PDF, not PNG), then copies this stream to another memory stream, then converts the second stream to a base64 string, then converts this base64 string to a byte array, and finally converts the byte array to base64 again. That is pretty wordy and, moreover, your code does not produce a PNG. – Vitaliy Shibaev Feb 19 '21 at 02:50
  • I guess my naming conventions are off for this example; I had copied it over from the base project. There I loop through the array of pages like this: for(int i = 0; i < pdf.Pages.Count; i++) { MemoryStream msPng = new MemoryStream(); pdf.Pages[i].Save(msPng, options); msPng.Position = 0; .....} – Nyra Feb 19 '21 at 18:30