Is there some utility that could be called via command line to produce a doc(x) file? The source file would be HTML and CSS.

I am trying to generate Word documents on the fly with PHP. The only library I am aware of is phpdocx, which is very low-level and not much use to me (I already have one poor implementation of Word document generation).

What I need from a document:

  • TOC
  • Images
  • Footers/Headers (they could be manually made on each HTML page)
  • Table
  • Lists
  • Page breaks (being able to decide what goes on which page, e.g. one HTML file per page, with multiple HTML files joined to produce the entire document)
  • Paragraphs
  • Basic bold/etc styles
Kara
Tower
  • Since HTML/CSS and Word documents are two entirely different document models, you'd think such a utility would make it even harder to create a Word document. How do you express headers and footers in an HTML page for starters? But who knows, maybe someone will come up with something... :) – deceze Jan 04 '12 at 09:57
  • @deceze I think it's obvious that they are different. You can't place an HTML5 video in a Word document, now can you? The point was to be able to convert `<p>` to a Word paragraph, same for `<ul>` and `<li>`, which have their counterparts in Word. Different pages would be done with different HTML files that are joined into one document. Footers and headers could be just laid out on each page separately. – Tower Jan 04 '12 at 10:00
  • @rFactor I think your best bet would be a combination of [DOM](http://php.net/manual/en/book.dom.php) and [PHPWord](http://phpword.codeplex.com/). – DaveRandom Jan 04 '12 at 10:06
  • What he's pointing at (correctly) is that one is a paged medium and the other isn't. In any case, the docx format is essentially a set of XML files inside a zip archive, so you could achieve a basic transformation from HTML to that XML with XSLT. This however can't help with CSS, and it would be **hard** to implement decently. What exactly doesn't satisfy you in PHPDOCX? – Viruzzo Jan 04 '12 at 10:09
  • @rFactor I was just browsing the PHPWord home page (not been there for a while) and I came across [this](http://htmltodocx.codeplex.com/)... – DaveRandom Jan 04 '12 at 10:28
  • @rFactor Well, actually you *can* place videos in a Word document... :-3 – deceze Jan 05 '12 at 03:05

2 Answers

I didn't find PHPDOCX very useful either. An alternative could be PHPWord; I think it covers what you need. According to the website, it can do these things:

  • Insert and format document sections
  • Insert and format Text elements
  • Insert Text breaks
  • Insert Page breaks
  • Insert and format Images and binary OLE-Objects
  • Insert and format watermarks (new)
  • Insert Header / Footer
  • Insert and format Tables
  • Insert native Titles and Table-of-contents
  • Insert and format List elements
  • Insert and format hyperlinks
  • Very simple template system (new)

On its own that isn't enough in your case, but there is a plugin available that converts (basic) HTML to DOCX, and in my opinion it works very well: http://htmltodocx.codeplex.com/

I have been using this for a year or two now and am happy with it, although I have to add that the HTML can't be too complex.
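
To give a rough idea of how the features in the list above fit together, here is a minimal sketch. It assumes the currently maintained Composer package `phpoffice/phpword` rather than the CodePlex release linked above (class and method names differ in that older version), and it uses the package's built-in `Html::addHtml()` helper in place of the htmltodocx plugin. The file name `report.docx` and all text content are made up for illustration.

```php
<?php
// Assumption: installed via `composer require phpoffice/phpword`.
require 'vendor/autoload.php';

use PhpOffice\PhpWord\IOFactory;
use PhpOffice\PhpWord\PhpWord;
use PhpOffice\PhpWord\Shared\Html;

$phpWord = new PhpWord();

// Titles need a style per depth so that they show up in the table of contents.
$phpWord->addTitleStyle(1, ['bold' => true, 'size' => 16]);

$section = $phpWord->addSection();
$section->addHeader()->addText('Report header');
$section->addFooter()->addText('Report footer');

// Table of contents on its own page.
$section->addTOC();
$section->addPageBreak();

// Page one: a title plus basic HTML (the markup must be well-formed).
$section->addTitle('Chapter one', 1);
Html::addHtml(
    $section,
    '<p>Some <strong>bold</strong> text.</p><ul><li>First</li><li>Second</li></ul>'
);

// Page two: a title plus a native table.
$section->addPageBreak();
$section->addTitle('Chapter two', 1);
$table = $section->addTable();
$table->addRow();
$table->addCell(3000)->addText('Name');
$table->addCell(3000)->addText('Value');

// Write the document in the .docx (Word 2007) format.
IOFactory::createWriter($phpWord, 'Word2007')->save('report.docx');
```

Each HTML fragment could be loaded from its own file and followed by `addPageBreak()`, which roughly matches the "one HTML file per page" idea from the question.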

mat

The way I usually do this is to have a Word document template file in which the parts I want to replace are marked with keywords (usually something like "{FIRSTNAME}"). You can then read the file via PHP, run str_replace on each keyword, and write the result to another file.

Dynamic tables using this method are a bit more tricky, as you need a sub template for a row, which you can then include inside the main template as many times as required.
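
A minimal sketch of that approach, including the row sub-template, might look like this. It assumes the template is stored in a text-based format that `str_replace` can operate on directly; to stay self-contained the example uses inline strings instead of reading a real template with `file_get_contents()`. The placeholder names, the helper `fillTemplate()` and the output path are all hypothetical.

```php
<?php
// Hypothetical main template; in practice this would come from file_get_contents().
$template = "Dear {FIRSTNAME} {LASTNAME},\nYour items:\n{ROWS}\nTotal: {TOTAL}\n";

// Hypothetical sub-template for a single table row.
$rowTemplate = "{ITEM}\t{PRICE}\n";

// Replace every placeholder key in $template with its value.
function fillTemplate(string $template, array $values): string
{
    // str_replace accepts parallel arrays of search and replacement strings.
    return str_replace(array_keys($values), array_values($values), $template);
}

$items = [
    ['{ITEM}' => 'Widget', '{PRICE}' => '9.50'],
    ['{ITEM}' => 'Gadget', '{PRICE}' => '20.00'],
];

// Repeat the row sub-template once per item.
$rows = '';
foreach ($items as $item) {
    $rows .= fillTemplate($rowTemplate, $item);
}

// Fill the main template, dropping the rendered rows into {ROWS}.
$document = fillTemplate($template, [
    '{FIRSTNAME}' => 'Ada',
    '{LASTNAME}'  => 'Lovelace',
    '{ROWS}'      => rtrim($rows, "\n"),
    '{TOTAL}'     => '29.50',
]);

// Write the result to a separate file so the template stays untouched.
$outputPath = sys_get_temp_dir() . '/output.txt';
file_put_contents($outputPath, $document);
```

One thing to watch: with array arguments, `str_replace` applies the replacements in order, so a value that itself contains a later keyword would be substituted again. `strtr()` with the same array avoids that if it ever becomes a problem.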

I'm not sure this is the best solution. It has always seemed very fiddly to me, and every time I'm asked to do this I get frustrated with it, but I guess it works. If anyone knows a better solution, I'd love to hear it too!

Nick