
I noticed the CAM::PDF Perl module is built to read the text from a PDF file. Right now I have CAM::PDF v1.59 installed. I've successfully used it to read text from a (v1.2) PDF file that is not password protected, but when I try to open a (v1.7) password-protected PDF file using this code...

use strict;
use warnings;
use PDF::API2;
use CAM::PDF;
use CAM::PDF::PageText;

my $file = 'C:\Users\gwilliams\Documents\PWS20130517new.pdf';

my $pdf = CAM::PDF->new($file, '-', '-', 's3cretpasswd', fault_tolerant => 1)
  or die "$CAM::PDF::errstr\n";

my $pageone_tree = $pdf->getPageContentTree(1);

print CAM::PDF::PageText->render($pageone_tree);

... I receive the error message:

Invalid xref stream: could not decode objstream 3085

The attributes of the PDF file itself are:

  • Version: 1.7
  • Security Method: Password Protection
  • Printing: Not Allowed
  • Fill In A Form: Not Allowed
  • Commenting: Not Allowed
  • Manage Pages: Not Allowed
  • Modify Document: Not Allowed
  • Content Copying: Not Allowed
  • Extract Contents: Not Allowed
  • Signing: Not Allowed

For what it's worth - it looks like the PDF in question was created with Adobe PDFMaker 10.1 for Excel.

What gives? Am I doing this right - or is the PDF incompatible with CAM::PDF?

Sincerely, Confused

Gary N.
  • Documentation for CAM::PDF claims it is mostly compatible w/ PDF v1.5. PDF versions 1.6 & 1.7 both added encryption options. This may be an issue. – tjd May 29 '13 at 19:06
  • Read under Data Manipulation in the documentation. – hwnd May 29 '13 at 19:27

1 Answer


I'm the author of CAM::PDF. It's probably a newer PDF feature that CAM::PDF doesn't support. I wrote the encryption support back in the PDF 1.2 days and have barely updated it since. So most likely it's not your fault but a limitation of the library.

Chris Dolan
  • OK. Good to know - direct from the source at that. I wove in pdftotext.exe (reluctantly) to handle the conversion, but it'd obviously be a lot cooler to handle that part from within Perl exclusively. – Gary N. May 30 '13 at 17:33
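
For anyone taking the same pdftotext route, here is a rough sketch of how the call might be wrapped so the rest of the script stays in Perl. It assumes the pdftotext binary (from Xpdf or Poppler) is on the PATH; the helper names pdftotext_args and pdf_to_text are made up for illustration and are not part of CAM::PDF or any other module. The -upw, -f and -l flags are pdftotext's user-password and first/last-page options.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: build the pdftotext argument list.
# -upw supplies the user password; -f / -l restrict the page range.
sub pdftotext_args {
    my ($pdf_file, $out_file, %opt) = @_;
    my @args = ('pdftotext');
    push @args, '-upw', $opt{password}   if defined $opt{password};
    push @args, '-f',   $opt{first_page} if defined $opt{first_page};
    push @args, '-l',   $opt{last_page}  if defined $opt{last_page};
    push @args, $pdf_file, $out_file;
    return @args;
}

# Hypothetical helper: run pdftotext and return the extracted text.
sub pdf_to_text {
    my ($pdf_file, %opt) = @_;

    # pdftotext writes the output file itself, so only the name is needed.
    my ($fh, $out_file) = tempfile(SUFFIX => '.txt', UNLINK => 1);
    close $fh;

    # List-form system() passes each argument separately instead of
    # building one command string, so the password needs no quoting.
    system(pdftotext_args($pdf_file, $out_file, %opt)) == 0
        or die "pdftotext failed (exit status $?)\n";

    open my $in, '<', $out_file or die "Cannot read $out_file: $!\n";
    local $/;                      # slurp the whole file
    my $text = <$in>;
    close $in;
    return $text;
}

# Example call (not executed here):
# print pdf_to_text($file, password => 's3cretpasswd',
#                   first_page => 1, last_page => 1);
```

If the password you have is the owner password rather than the user password, pdftotext takes it with -opw instead of -upw. Whether extraction succeeds on a file with "Content Copying: Not Allowed" may depend on which password is supplied and on the pdftotext build.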