
I'm using Ghostscript (currently 9.27) to reduce the size of PDF files in my application before uploading them to a file server. The issue I'm facing is that some PDF files are converted to a blank PDF. However, if I open the original PDF in Adobe Acrobat, save it, and then run my Ghostscript routine, it works fine: the PDF displays and is correctly "compressed" (reduced quality).

I've tried different PDF settings, but the one I want is /ebook, so I'd like to make it work with ebook quality. I'm using a Ghostscript wrapper (code posted below), and the call I'm making is:

RunGS("-dQUIET", "-dBATCH", "-dNOPAUSE", "-dNOGC", "-dPDFSETTINGS=/ebook", "-sDEVICE=pdfwrite", "-sOutputFile=" & OUTPUT_FILE, INPUT_FILE)

When the final result is a blank PDF file, the conversion takes longer than usual, and I've just noticed that I get an error callback. It says:

GhostScriptUnrecoverable error, exit code -100

This is the non-working file (original): https://docdro.id/YuZslRm

And this is the file after being saved with Acrobat, which works fine: https://docdro.id/cAoUCS5

Here is the wrapper, just in case:

Imports System.Runtime.InteropServices

Module GhostscriptDllLib

Private Declare Function gsapi_new_instance Lib "gsdll32.dll" _
  (ByRef instance As IntPtr, _
  ByVal caller_handle As IntPtr) As Integer

Private Declare Function gsapi_set_stdio Lib "gsdll32.dll" _
  (ByVal instance As IntPtr, _
  ByVal gsdll_stdin As StdIOCallBack, _
  ByVal gsdll_stdout As StdIOCallBack, _
  ByVal gsdll_stderr As StdIOCallBack) As Integer

Private Declare Function gsapi_init_with_args Lib "gsdll32.dll" _
  (ByVal instance As IntPtr, _
  ByVal argc As Integer, _
  <MarshalAs(UnmanagedType.LPArray, ArraySubType:=UnmanagedType.LPStr)> _
  ByVal argv() As String) As Integer

Private Declare Function gsapi_exit Lib "gsdll32.dll" _
  (ByVal instance As IntPtr) As Integer

Private Declare Sub gsapi_delete_instance Lib "gsdll32.dll" _
  (ByVal instance As IntPtr)

'--- Run Ghostscript with specified arguments

Public Function RunGS(ByVal ParamArray Args() As String) As Boolean

    Dim InstanceHndl As IntPtr
    Dim NumArgs As Integer
    Dim StdErrCallback As StdIOCallBack
    Dim StdInCallback As StdIOCallBack
    Dim StdOutCallback As StdIOCallBack

    NumArgs = Args.Length

    StdInCallback = AddressOf InOutErrCallBack
    StdOutCallback = AddressOf InOutErrCallBack
    StdErrCallback = AddressOf InOutErrCallBack

    '--- Shift arguments to begin at index 1 (Ghostscript requirement)

    ReDim Preserve Args(NumArgs)
    System.Array.Copy(Args, 0, Args, 1, NumArgs)

    '--- Start a new Ghostscript instance

    If gsapi_new_instance(InstanceHndl, IntPtr.Zero) <> 0 Then
        Return False
    End If

    '--- Set up dummy callbacks

    gsapi_set_stdio(InstanceHndl, StdInCallback, StdOutCallback, StdErrCallback)

    '--- Run Ghostscript using specified arguments

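    Dim ExitCode As Integer = gsapi_init_with_args(InstanceHndl, NumArgs + 1, Args)

    '--- Exit Ghostscript

    gsapi_exit(InstanceHndl)

    '--- Delete instance

    gsapi_delete_instance(InstanceHndl)

    '--- 0 is success; -101 (gs_error_Quit) is a normal exit; anything else is an error

    Return ExitCode = 0 OrElse ExitCode = -101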

End Function

'--- Delegate function for callbacks

Private Delegate Function StdIOCallBack(ByVal handle As IntPtr, _
  ByVal Strz As IntPtr, ByVal Bytes As Integer) As Integer

'--- Dummy callback for standard input, standard output, and errors

Private Function InOutErrCallBack(ByVal handle As IntPtr, _
  ByVal Strz As IntPtr, ByVal Bytes As Integer) As Integer

    Dim objString As String
    objString = Marshal.PtrToStringAnsi(Strz, Bytes)       
    Return 0

End Function

End Module

Any ideas about how to avoid this? I wouldn't mind adding an extra processing step or trying something else. As I said, this only happens with some specific files (we get them from our customers); roughly 98% of them are reduced in size correctly.

BenMorel
  • You are going to have to supply a **complete** PDF file which exhibits the problem, it doesn't have to be here, you can put it somewhere else and post a link. The 'header' of a PDF file is just the first line (optionally a second line of commented binary), in your case that's the %PDF-1.3 line. The rest of it is the body of the PDF file. If you've got a problem, then you should really open a bug report (bugs.ghostscript.com). Can you reproduce this using the Ghostscript command line executable ? If not then it may be something you're doing. If yes, then you'd be better to open a bug. – KenS Aug 29 '19 at 10:02
  • Thanks for your suggestion. I've edited the post and added 2 links. I'll try to learn how to run it from the command line and report back here too. – Marçal Torroella Aug 29 '19 at 10:19

1 Answer


OK so you say 'it doesn't prompt any error', however when I run your file here Ghostscript starts by saying:

   **** Warning: Discovered more entries in xref than declared in trailer /Size
   **** Warning:  File has an invalid xref entry:  2.  Rebuilding xref table.

And then on every page says:

   **** Error: stream operator isn't terminated by valid EOL.
               Output may be incorrect.
   **** Error: stream operator isn't terminated by valid EOL.
               Output may be incorrect.

and ends up with:

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>>  <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

   **** The rendered output from this file may be incorrect.

Which I would have said was a fairly large number of errors. Note that when you save the file from Acrobat it will, naturally, fix these syntax problems, so of course Ghostscript will then not complain, as the saved file is valid.

That said, using a command line based on yours:

"c:\program files\gs\gs9.27\bin\gswin64c" -sDEVICE=pdfwrite -sOutputFile=out.pdf -dBATCH -dNOPAUSE -dNOGC -dPDFSETTINGS=/ebook 20194114_EXPORT_DOCS_Original.pdf

produces fewer warnings, because you've specified -dQUIET. If you're trying to investigate a problem, then suppressing warnings is probably not ideal. Are you seeing any of the back channel output from Ghostscript? If so, you should post it here as well. If not, then you need to implement code to capture it; it's important information.
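A minimal sketch of such capture code, assuming the same delegate signature as the `StdIOCallBack` in the question's wrapper. The names `GhostscriptOutputCapture`, `CaptureCallBack`, `NoInputCallBack` and `TakeGsOutput` are hypothetical. It relies on the Ghostscript API convention that the stdout and stderr callbacks return the number of bytes they consumed, while the stdin callback returns 0 to signal end of input, so stdin gets its own function.

```vbnet
Imports System
Imports System.Runtime.InteropServices
Imports System.Text

Module GhostscriptOutputCapture

    ' Same shape as the StdIOCallBack delegate in the question's wrapper.
    Public Delegate Function StdIOCallBack(ByVal handle As IntPtr, _
      ByVal Strz As IntPtr, ByVal Bytes As Integer) As Integer

    ' Accumulates everything Ghostscript writes to stdout and stderr.
    Private ReadOnly GsOutput As New StringBuilder()

    ' Held in module-level fields so the delegates are not garbage
    ' collected while native code may still call them.
    Public ReadOnly StdInCallback As StdIOCallBack = AddressOf NoInputCallBack
    Public ReadOnly StdOutErrCallback As StdIOCallBack = AddressOf CaptureCallBack

    ' stdin: no input is supplied, so report end of input.
    Private Function NoInputCallBack(ByVal handle As IntPtr, _
      ByVal Strz As IntPtr, ByVal Bytes As Integer) As Integer
        Return 0
    End Function

    ' stdout/stderr: keep the text and report every byte as consumed.
    Private Function CaptureCallBack(ByVal handle As IntPtr, _
      ByVal Strz As IntPtr, ByVal Bytes As Integer) As Integer
        If Bytes > 0 Then
            GsOutput.Append(Marshal.PtrToStringAnsi(Strz, Bytes))
        End If
        Return Bytes
    End Function

    ' Returns the captured text and clears the buffer for the next run.
    Public Function TakeGsOutput() As String
        Dim captured As String = GsOutput.ToString()
        GsOutput.Length = 0
        Return captured
    End Function

End Module
```

To use it with the wrapper, the two callback functions would be merged into the wrapper's module and passed to `gsapi_set_stdio` in place of the dummy callback; the captured text can then be logged or inspected after `gsapi_init_with_args` returns.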

NB don't use -dNOGC; that's a debugging-only switch. I know, people keep posting it as part of their command line, usually because they 'researched it' (found it on Google). Don't use it.

Anyway, with that command line I get a PDF file which looks reasonable and is 20% the size of the original.

Using your command line (or something as close to it as I can) doesn't reproduce the problem for me (either on 32-bit or 64-bit, using current code or the 9.27 release) so I can only speculate as to problems. If you had set -dPDFSTOPONERROR that would exit immediately on reading the file (with a lengthy error message), and would produce an empty PDF file. I can't think of any other way you could get that, especially 'with no error'.

FWIW by default Ghostscript attempts to repair invalid PDF files, or at least ignore errors as far as possible. The PDFSTOPONERROR switch is intended for use in commercial environments where it's important that files which might not render correctly are flagged and checked/rejected/repaired rather than being wastefully printed.

On which note: I notice that you appear to be using Ghostscript commercially and are linking to the DLL. I feel I should point you to the license under which Ghostscript is supplied (AGPL v3); you should probably check that your usage is valid under its terms.
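As a rough illustration of the switches discussed so far, the argument list could be built without -dQUIET and -dNOGC, with -dPDFSTOPONERROR as an opt-in. This is only a sketch: `BuildGsArgs` and its parameters are hypothetical names, not part of the question's wrapper.

```vbnet
Imports System
Imports System.Collections.Generic

Module GhostscriptArgs

    ' Hypothetical helper: builds the pdfwrite argument list used in the question,
    ' minus -dQUIET (hides warnings) and -dNOGC (debugging only).
    Public Function BuildGsArgs(ByVal inputFile As String, _
      ByVal outputFile As String, _
      ByVal stopOnError As Boolean) As String()

        Dim args As New List(Of String)()
        args.Add("-dBATCH")
        args.Add("-dNOPAUSE")
        args.Add("-dPDFSETTINGS=/ebook")
        args.Add("-sDEVICE=pdfwrite")

        ' Opt-in: fail on a damaged PDF instead of attempting a repair.
        If stopOnError Then
            args.Add("-dPDFSTOPONERROR")
        End If

        args.Add("-sOutputFile=" & outputFile)
        args.Add(inputFile)

        Return args.ToArray()
    End Function

End Module
```

Because `RunGS` takes a `ParamArray`, the result can be passed directly, for example `RunGS(BuildGsArgs(INPUT_FILE, OUTPUT_FILE, False))`.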

KenS
  • Thank you so much! So much good info there. When I said that I had no errors I was referring to the .NET wrapper, but it probably doesn't give me the necessary information; as you said, I need to capture it in some way, as I don't have any feedback right now. Running it from the console shows the errors you mentioned (including that the file may not be readable); however, as you said, the output opens correctly (with quite good quality, reduced from 2.5MB to 400KB). I tried the PDFSTOPONERROR parameter but it doesn't seem to have any effect. I may need to implement a new wrapper or investigate further. – Marçal Torroella Aug 29 '19 at 13:36
  • I've just noticed I was getting an error callback... it says: GhostScriptGPL Ghostscript GhostScript9.27 GhostScript: GhostScriptUnrecoverable error, exit code -100 – Marçal Torroella Aug 29 '19 at 14:19
  • -100 is gs_error_Fatal, it just means 'something really bad happened, not sure what but it's terminal'. If you want to see the back-channel information (which is actually useful when there's a problem, so I'd recommend it) you need to implement the stdout and stderr callbacks. For me -dPDFSTOPONERROR throws a normal PostScript error and exits. I can't think how your wrapper is ending up with an empty file, but error -100 is a good bet: the pdfwrite device will close the output file when there's an error, and if nothing has happened yet, then it'll be empty (like -dPDFSTOPONERROR) – KenS Aug 29 '19 at 14:37
  • As this is a very occasional error, I've ended up evaluating the StdIOCallBack, and if it reports an error I just skip the "compression" and put the original file in the database. It's much better to keep the integrity of the file than to gain some space. Thanks for your knowledge and help! – Marçal Torroella Aug 29 '19 at 15:11
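The fallback described in the last comment could be sketched roughly as follows. This is an illustration, not the asker's actual code: `ChooseFileToUpload` and `IsGsSuccess` are hypothetical names, and it assumes the caller has kept the value returned by `gsapi_init_with_args`. Treating -101 (gs_error_Quit) as a normal exit is an assumption based on common wrapper practice; -100 is the fatal error discussed above.

```vbnet
Imports System
Imports System.IO

Module CompressionFallback

    ' Assumption: gs_error_Quit (-101) is a normal way for a run to end.
    Private Const GsErrorQuit As Integer = -101

    ' True when the Ghostscript exit code indicates a normal run.
    Public Function IsGsSuccess(ByVal exitCode As Integer) As Boolean
        Return exitCode = 0 OrElse exitCode = GsErrorQuit
    End Function

    ' Returns the path that should be stored: the compressed file when the run
    ' succeeded and produced a non-empty file, otherwise the untouched original.
    Public Function ChooseFileToUpload(ByVal originalFile As String, _
      ByVal compressedFile As String, _
      ByVal exitCode As Integer) As String

        If Not IsGsSuccess(exitCode) Then Return originalFile
        If Not File.Exists(compressedFile) Then Return originalFile

        Dim info As New FileInfo(compressedFile)
        If info.Length = 0 Then Return originalFile

        Return compressedFile
    End Function

End Module
```

A size check alone will not catch a PDF that is structurally valid but renders blank, so the exit code (or the captured stderr text) remains the main signal.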