Here's my situation. I have an equipment test platform that runs tests (Selenium, Perl, and a lot of custom code). The results are currently output into an HTML file which any browser can display. It's one long page with many large tables (full page width and spanning multiple pages, though any single row is smaller than a page). The problem is that this page does not print well. I haven't found a browser yet that obeys all the CSS3 print directives, particularly starting headers on odd pages (works on IE10+ only) and not splitting table rows across pages. I'd also like to expand the Table of Contents with leader dots and auto-filled page numbers. Header page numbers and security footers work OK. If there were a browser that printed cleanly from HTML, that would be satisfactory.

Anyway, my customers are asking if they could get the output in PDF format. Looking around, I see a few HTML-to-PDF converters, but either they don't work that well, are expensive, require a server installation (for PHP), or are on someone else's site (we're concerned about sending our proprietary information offsite to convert). The resulting PDF should be fairly straightforward: no input or forms, no encryption, a few PNG or JPEG images, a few text colors, and a lot of tables and text. All page links are within the document -- it is OK to lose them. There are just a few "standard" fonts. Can anyone suggest some options? A command-line interface is fine (in fact, preferred, so it can be run from a script). I tried direct PDF output from Chrome, and the results were unsatisfactory (can I configure page size, scale, etc.?).

I'm open to producing the PDF from the original text and data (i.e., not going through an HTML file as an intermediate form). A library that would work with my current Perl code would be preferred, to minimize rewriting (how good is PDF::Create?). I see some PHP-based libraries, but is it possible to have PHP without a server? (server installation is not allowed here) Very importantly, the whole point of outputting to HTML in the first place was that I didn't have to count characters and line lengths and fit to pages -- let a browser take care of all that. Is there a library to output PDF that takes care of all that clerical stuff for me?

Whatever the solution, it needs to run on Windows at a minimum (Linux is fine, too). Thanks much for suggestions!

Phil Perry

1 Answer

From a quick check on CPAN, it looks like both PDF::Create and PDF::API2 should do what you are trying to do. Both modules seem to provide fairly low-level access to PDF functions, though, which you may find difficult to use unless you have a specific idea of how you want your page layout done. Fortunately, PDF::API2 has a number of helper modules that make things like table layouts and automatic text flow easier to deal with. Here are some related Stack Exchange links on the subject (with sample code in the responses) that you might find helpful:

What's the best method to generate Multi-Page PDFs with Perl and PDF::API2?

How can I make PDF tables from Perl?

How to add header, footer with images using PDF::API2::Lite?

As for working on Windows and UNIX, I just did a quick check and installed PDF::API2 using the CPAN installer on my desktop (ActivePerl 5.16.3 MSWin32-x86-multi-thread), so in theory it should work on Windows for you as well, assuming you are allowed to install the required modules someplace.

Your mileage may vary.
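To give a feel for what "low level" means here, this is a minimal sketch (not a tested solution for your report) of laying out rows with plain PDF::API2 and starting a new page before a row would run off the bottom, so a row is never split. The row data, margins, column positions, and the output file name `results.pdf` are all made up for illustration. The method names are the long-standing ones (`mediabox`, `pages`, `saveas`); newer PDF::API2 releases document newer names for some of these.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical test results; in practice these would come from the test platform.
my @rows = map { [ "Test $_", $_ % 7 ? 'PASS' : 'FAIL' ] } 1 .. 120;

my $pdf  = PDF::API2->new();
my $font = $pdf->corefont('Helvetica');    # one of the standard PDF core fonts

# Assumed layout, in points, for a Letter page (612 x 792).
my ($top, $bottom, $leading) = (740, 50, 14);
my ($text, $y);

for my $row (@rows) {
    # Start a new page when the next row would fall below the bottom margin,
    # so that no row is ever split across two pages.
    if (!defined $y || $y - $leading < $bottom) {
        my $page = $pdf->page();
        $page->mediabox('Letter');
        $text = $page->text();
        $text->font($font, 10);
        $y = $top;
    }
    $text->translate(72, $y);     # column 1: test name
    $text->text($row->[0]);
    $text->translate(300, $y);    # column 2: result
    $text->text($row->[1]);
    $y -= $leading;
}

my $page_count = $pdf->pages();   # number of pages created so far
$pdf->saveas('results.pdf');
print "Wrote $page_count pages\n";
```

All positioning is done by hand, which is exactly the clerical work the helper modules (PDF::Table and friends) are meant to take off your hands.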

Rick Sarvas
  • Thanks, Rick. I've been playing with PDF::API2 and PDF::Table today, and they don't seem to be playing nicely with each other. Maybe I'm using them wrong, but Table doesn't seem to pass information such as the current page and location back to API2. For example, I've basically merged their two demos (SYNOPSIS sections), but the text comes out on page 1 rather than the current page (3), and as one long line. I'll look at your links and see if they offer any clues. I was unable to install PDF::API2::Simple, which offers "autoflow" text and x & y position. – Phil Perry Jan 07 '14 at 00:02
  • I'm not sure about PDF::API2 and PDF::Table playing nice together but I think I see the problem with PDF::API2::Simple. This is one of those modules that requires Module::Install (and dependencies) to be installed first, and that can be a pain on Windows. An alternative is to go into the CPAN temp build directory for PDF::API2::Simple ("C:\Tools\Perl\cpan\build\PDF-API2-Simple-1.1.4-oBKMNB\lib\PDF\API2" on my system) and manually copy the "Simple.pm" file to your Perl\site\lib\PDF\API2 directory. As this module does not have a binary dependency this should be OK, though this is not ideal. – Rick Sarvas Jan 07 '14 at 05:47
  • I installed Module::Install, but installing PDF::API2::Simple fails very early on a dmake test of some sort (dmake _does_ appear to be installed). Following your manual copy instructions, I now have Simple working. I have to get back to doing some "real" work now :(, but I'll try to get back to fiddling with this soon. I would still appreciate any good suggestions for getting to PDF, either from HTML or generated from scratch, so I can avoid reinventing the wheel. Thanks for your time! – Phil Perry Jan 07 '14 at 16:26
  • I think that PDF::API2::Simple is probably enough to do most of what I need (thanks, Rick, for the info on how to get it working). Still some questions: 1) Is there a way to avoid doing two passes to fill in forward references, such as the TOC page numbers and "Page M of N" on each page? Is it possible to get to the end and then somehow jump back to earlier pages and update them? 2) Is the pages method only available after closing the output PDF file? 3) What is the current_page method? It returns an object; is there anything useful in it? I can track the current page in other ways. (cont) – Phil Perry Jan 07 '14 at 23:35
  • (cont) 4) If you change a text attribute (font, size, color, etc.), it is a global change (permanent). Is there any way to localize it to one text() method call? If not, I will look at adding a new parameter to localize it. 5) The text() method leaves x,y at the beginning of the next line. Is there any way to leave x,y at the next point in line (e.g., to change to italic font for one word, and then resume)? I have some ideas about a new parameter to deal with this. Finally, I will look at integrating PDF::Table (or code inspired by it) into Simple, and offer it to RedTree for public use. – Phil Perry Jan 07 '14 at 23:40
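On questions 1 and 5 above, here is a rough sketch using plain PDF::API2 rather than PDF::API2::Simple (whether Simple exposes the same thing is not something this example shows). It relies on two behaviours of PDF::API2: the document stays in memory until it is saved, so earlier pages can be reopened with `openpage` and given more content, and successive `text()` calls on one text object continue on the same line, so the font can be switched mid-line. The page contents, coordinates, and the file name `numbered.pdf` are invented for illustration.

```perl
use strict;
use warnings;
use PDF::API2;

my $pdf    = PDF::API2->new();
my $roman  = $pdf->corefont('Helvetica');
my $italic = $pdf->corefont('Helvetica-Oblique');

# Single layout pass: write the body of each page.
for my $n (1 .. 3) {
    my $page = $pdf->page();
    $page->mediabox('Letter');
    my $text = $page->text();
    $text->translate(72, 700);

    # Each text() call leaves the position at the end of what it wrote,
    # so a font change applies only to the words that follow it.
    $text->font($roman, 12);
    $text->text('Results for ');
    $text->font($italic, 12);
    $text->text("section $n");
    $text->font($roman, 12);
    $text->text(' follow.');
}

# Back-fill: the total is known now, and nothing has been saved yet,
# so reopen every page (1-based index) and add the footer.
my $total = $pdf->pages();
for my $n (1 .. $total) {
    my $page = $pdf->openpage($n);
    my $text = $page->text();          # a new content stream on the existing page
    $text->font($roman, 9);
    $text->translate(306, 30);         # horizontal centre of a Letter page
    $text->text_center("Page $n of $total");
}

$pdf->saveas('numbered.pdf');
```

The same back-fill idea should extend to TOC page numbers: record the page number of each heading during layout, then reopen the TOC page before saving and write the numbers in. Space for the TOC has to be reserved up front, since its own length affects the numbering of everything after it.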