I want to convert each page of a PDF file to a separate image, and I am using Ghostscript.NET to do it. The problem is that pageImage is null after the line System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i); and I can't figure out why. Here is the method I use:

 public static List<string> GetPDFPageText(Stream pdfStream, string dataPath)
    {

        try
        {
            int dpi = 100;
            GhostscriptVersionInfo lastInstalledVersion =
                GhostscriptVersionInfo.GetLastInstalledVersion(
                    GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                    GhostscriptLicense.GPL);
            List<string> textParagraphs = new List<string>();

            using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
            {
                rasterizer.Open(pdfStream, lastInstalledVersion,false);

                for (int i = 1; i <= rasterizer.PageCount; i++)
                {
                    // here is the problem, pageImage returns null
                    System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i);

                    // rest of code is unrelated to problem..
                    
                }
            }

            return textParagraphs;
        }
        catch (Exception ex)
        {
            // Keep the original exception as InnerException so the real cause is not lost
            throw new Exception("An error occurred.", ex);
        }
        
    }

The Stream pdfStream parameter comes from the code below:

            using (StreamCollection streamCollection = new StreamCollection())
            {
                FileStream imageStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read);
                // This is the parameter I used for "Stream pdfStream"
                FileStream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.Read);
                streamCollection.Streams.Add(imageStream);
                streamCollection.Streams.Add(pdfStream);
                PDFHelper.SavePDFByFilesTest(dataPath, streamCollection.Streams,mergedFilePath);
            }

I am comfortable with the StreamCollection class because I have used it before in a similar situation and it worked. I verified that the file path is correct and that the stream contains the file. I also tried a MemoryStream instead of a FileStream, and a file name instead of a stream, to see whether the problem is related to them. Do you have any suggestions? I would really appreciate it.

1 Answer

Okay, I figured out why it didn't work. I use the latest version of Ghostscript (9.56.1), as K J mentioned (thank you for the response), and it uses a new PDF interpreter by default. I assume the new interpreter did not handle my file properly because it is still very new and may have some minor problems for now. I added the following line to switch back to the old PDF interpreter:

rasterizer.CustomSwitches.Add("-dNEWPDF=false");

I also set the resolution of the produced image with the following line:

rasterizer.CustomSwitches.Add("-r300x300");
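For context, here is a minimal sketch of where these two lines could sit in the method from the question. It is not the exact code I run: the class name PdfPageExporter, the method SavePagesAsPng, and the outputDir parameter are made up for illustration, and I am assuming the switches have to be added before Open() because that is when the arguments are passed to Ghostscript. The null check is only there so that a failed page shows up as a clear error instead of a NullReferenceException later on.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Ghostscript.NET;
using Ghostscript.NET.Rasterizer;

public static class PdfPageExporter
{
    // Hypothetical helper: saves every page of the PDF as a PNG file in outputDir.
    public static void SavePagesAsPng(Stream pdfStream, string outputDir, int dpi = 100)
    {
        GhostscriptVersionInfo version =
            GhostscriptVersionInfo.GetLastInstalledVersion(
                GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                GhostscriptLicense.GPL);

        using (var rasterizer = new GhostscriptRasterizer())
        {
            // Assumption: custom switches are read when the document is opened,
            // so they are added before the call to Open().
            rasterizer.CustomSwitches.Add("-dNEWPDF=false"); // use the old PDF interpreter
            rasterizer.CustomSwitches.Add("-r300x300");      // output resolution

            rasterizer.Open(pdfStream, version, false);

            for (int page = 1; page <= rasterizer.PageCount; page++)
            {
                using (Image pageImage = rasterizer.GetPage(dpi, page))
                {
                    // Fail loudly if the rasterizer could not produce the page.
                    if (pageImage == null)
                    {
                        throw new InvalidOperationException(
                            $"Page {page} could not be rasterized.");
                    }

                    string target = Path.Combine(outputDir, $"page-{page}.png");
                    pageImage.Save(target, ImageFormat.Png);
                }
            }
        }
    }
}
```

It would be called with the same kind of stream as in the question, for example SavePagesAsPng(pdfStream, dataPath).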

Furthermore, here is the structure of the StreamCollection class, which I implemented based on an existing reference. I hope it helps someone.

public class StreamCollection :  IDisposable
    {
        private bool disposedValue;
        
        public List<Stream> Streams { get; set; }

        public StreamCollection()
        {
            Streams = new List<Stream>();
        }
        
        protected virtual void Dispose(bool disposing)
        {
            if (!disposedValue)
            {
                if (disposing)
                {
                    // TODO: dispose managed state (managed objects)
                    if (this.Streams != null && this.Streams.Count>0)
                    {
                        foreach (var stream in this.Streams)
                        {
                            if (stream != null)
                                stream.Dispose();
                        }
                    }
                }

                // TODO: free unmanaged resources (unmanaged objects) and override finalizer
                // TODO: set large fields to null
                disposedValue = true;
            }
        }

        // // TODO: override finalizer only if 'Dispose(bool disposing)' has code to free unmanaged resources
        // ~StreamCollection()
        // {
        //     // Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
        //     Dispose(disposing: false);
        // }

        public void Dispose()
        {
            // Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
            Dispose(disposing: true);
            GC.SuppressFinalize(this);
        }
    }
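As a quick check of what the class does, the sketch below adds two in-memory streams to a StreamCollection and shows that leaving the using block disposes both of them. It assumes the StreamCollection class above is compiled in the same project; StreamCollectionDemo is just a name I picked for the example.

```csharp
using System;
using System.IO;

public static class StreamCollectionDemo
{
    public static void Main()
    {
        var first = new MemoryStream(new byte[] { 1, 2, 3 });
        var second = new MemoryStream(new byte[] { 4, 5, 6 });

        // StreamCollection is the class shown above (assumed to be in the same project).
        using (var collection = new StreamCollection())
        {
            collection.Streams.Add(first);
            collection.Streams.Add(second);

            // Both streams are still open inside the using block.
            Console.WriteLine(first.CanRead);  // True
        }

        // Dispose() on the collection disposed every stream it held.
        Console.WriteLine(first.CanRead);   // False
        Console.WriteLine(second.CanRead);  // False
    }
}
```

This is why the calling code in the question does not wrap imageStream and pdfStream in their own using blocks: the collection owns them once they are added.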
  • You should open a bug report with the input file attached to it, and the configuration. If nobody reports errors they won't get fixed, and sooner or later the old interpreter will be gone. – KenS May 29 '22 at 15:40
  • You are right. I created an issue for this problem. You can follow its progress [here](https://github.com/jhabjan/Ghostscript.NET/issues/101) – Mert Baykar May 29 '22 at 19:18
  • Well, I actually meant a bug for Ghostscript, not Ghostscript.NET, at http://bugs.ghostscript.com. Maybe jhabjan will open a Ghostscript bug report, but it would be better if you did so. The report you link to doesn't seem to be the same thing at all; it says that Ghostscript.NET doesn't work with VS 2022. Are you the 'Graybeard' who asked those two questions here on Stack Overflow as well? You apparently haven't supplied the input file with any of these questions, nor with the report on GitHub. Nobody is going to be able to fix the problem without the input file. – KenS May 30 '22 at 08:25