I'm trying to remove a signature or certificate from PDFs through automation. I have thousands of signed PDF files, and I'm converting them to HTML to preserve the layout when extracting text. The issue is that the signature inside each PDF is converted along with the rest of the text, which makes the output hard to parse on the front end.

(image: signature example)

The code I'm using for the automation:

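from win32com.client import Dispatch  # pywin32; requires Adobe Acrobat installed

filename = "to_convert"  # placeholder base name, without extension
AvDoc = Dispatch("AcroExch.AVDoc")
if AvDoc.Open(filename + ".pdf", ""):
    pdDoc = AvDoc.GetPDDoc()
    jsObject = pdDoc.GetJSObject()
    # Export the document to HTML through Acrobat's JavaScript bridge
    jsObject.SaveAs(filename + ".html", "com.adobe.acrobat.html")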

The result I get when I open the HTML page: (image: signature in the output)

This little signature disrupts the entire flow when parsing the text. Any suggestions?

One solution in the GUI is to right-click the file, choose the Combine option, and save the result. But that isn't possible in my automation flow.
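
One direction that might be worth trying in the automation flow, sketched here under assumptions and not tested against Acrobat: if the signature is stored as a form field of type "signature", the Acrobat JavaScript document object exposes `numFields`, `getNthFieldName`, `getField`, and `removeField`, which could be used to drop such fields before calling `SaveAs`. The helper name `remove_signature_fields` is hypothetical, and whether Acrobat allows removing a signed field may depend on the document's permissions.

```python
def remove_signature_fields(js_object):
    """Remove every form field whose type is "signature".

    `js_object` is assumed to be the object returned by pdDoc.GetJSObject().
    Returns the list of field names that were removed.
    """
    # Collect the names first so removal does not shift indices mid-loop.
    # int() guards against the count arriving as a float through COM.
    count = int(js_object.numFields)
    names = [js_object.getNthFieldName(i) for i in range(count)]

    removed = []
    for name in names:
        field = js_object.getField(name)
        if field is not None and field.type == "signature":
            js_object.removeField(name)
            removed.append(name)
    return removed
```

It would be called as `remove_signature_fields(jsObject)` right before the `jsObject.SaveAs(...)` line. If the signature is drawn as ordinary page content and not as a form field, this would not affect it.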
