10

I am using Microsoft Word 2007. I would like to convert the Word document to XSL-FO. There are some hints on the net, but only for RenderX. Is there such a tool for Apache FOP?

Ilmari Karonen
Thunder
  • XSL-FO is pretty standardized, and apart from some table stuff FOP should render it OK. Do you get error messages when trying to render the XSL-FO with Apache FOP? – chiborg Sep 17 '10 at 10:22
  • In case you intend to use XSL-FO to produce PDF output, be aware that Word 2007 and Word 2010 can "Save As..." PDF directly. – Dimitre Novatchev Sep 17 '10 at 13:44

5 Answers

8

RenderX has a set of freely available XSLT stylesheets for converting Microsoft's WordprocessingML documents to XSL-FO.

You don't have to render the generated XSL-FO with RenderX. Run the stylesheets to produce XSL-FO output, then render it in any XSL-FO engine, including Apache FOP.
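As a rough sketch of that workflow with Apache FOP itself: FOP's command-line launcher can apply a stylesheet to an XML input and either save the intermediate XSL-FO (`-foout`) or render straight to PDF (`-pdf`). The file names below are hypothetical, and the sketch assumes that a `fop` launcher is on the PATH, that the document was saved from Word as WordprocessingML, and that the chosen stylesheet runs under the XSLT processor bundled with FOP.

```python
import shutil
import subprocess


def fop_command(wordml, stylesheet, out, fo_only=False):
    """Build a FOP command line that applies an XSLT stylesheet to a WordML file."""
    # -foout stops after the XSLT step and saves the XSL-FO;
    # -pdf runs the transform and renders the result to PDF.
    target = "-foout" if fo_only else "-pdf"
    return ["fop", "-xml", str(wordml), "-xsl", str(stylesheet), target, str(out)]


def convert(wordml, stylesheet, out, fo_only=False):
    """Run FOP; raises if the launcher is missing or FOP reports an error."""
    if shutil.which("fop") is None:
        raise RuntimeError("the 'fop' launcher was not found on PATH")
    subprocess.run(fop_command(wordml, stylesheet, out, fo_only), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(fop_command("report.xml", "Word2FO.xsl", "report.fo", fo_only=True)))
    print(" ".join(fop_command("report.xml", "Word2FO.xsl", "report.pdf")))
```

Saving the XSL-FO first makes it easier to see whether a rendering problem comes from the stylesheet output or from FOP.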

Antenna House also has a WordMLToFO stylesheet, but it is not free ($200).

Mads Hansen
2

docx4j uses FOP to create PDFs from docx files.

The XSLT is in here, but you may prefer to start with this webapp, which can emit XSLFO from an uploaded docx.

It uses extension functions to do the heavy lifting, so it only really works as part of docx4j, but that's readily available and ASLv2 licensed.

Yes, RenderX has its Word2FO stylesheets (http://www.renderx.com/tools/word2fo.html), but the licence is restrictive, and the 20070227 version targets Word 2003 WordML. There may be a newer one; it has been a while since I looked.

JasonPlutext
  • How do I apply the XSLT to a Word document? It contains several XML files internally; which one should I use? – hardywang Jun 28 '13 at 18:52
1

Word can do it on its own. Here are Microsoft's instructions: http://msdn.microsoft.com/en-us/library/office/aa537167%28v=office.11%29.aspx#officewordwordmltoxsl-fo_creatinganxslfodocumentfromword

Here is the download link for the required XSL - Word2FO.xsl: http://www.microsoft.com/en-us/download/details.aspx?id=16876

Phil
0

If you want DOCX (Word 2007) support, you have to decompress the file and merge the individual resources before you can use the stylesheets. And that is only half of the problem: last time I checked, the stylesheets had severe limitations in areas such as styles/themes and continued sections. If you can afford it, a commercial DOCX-to-PDF engine may be what you need. One important thing to remember is that going through XSL-FO is not really feasible for faithful output, as XSL-FO doesn't support tabs, tight wrapping of text around images, or some other Word features.
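The decompress-and-merge step can be sketched roughly as follows. A .docx file is a ZIP archive whose main body lives in `word/document.xml`, with styles, themes, and relationships in sibling parts. The wrapper layout here (a `package` element holding one `part` element per file) is purely hypothetical; the merged format a given stylesheet actually expects has to be taken from its own documentation.

```python
import zipfile
import xml.etree.ElementTree as ET


def merge_docx_parts(docx_path, wrapper_tag="package"):
    """Collect the XML parts of a .docx under one (hypothetical) wrapper element."""
    root = ET.Element(wrapper_tag)
    with zipfile.ZipFile(docx_path) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith((".xml", ".rels")):
                continue  # skip images and other binary parts
            # Record the part's path inside the archive, then embed its XML tree.
            part = ET.SubElement(root, "part", {"name": "/" + name})
            part.append(ET.fromstring(archive.read(name)))
    return ET.ElementTree(root)


if __name__ == "__main__":
    # "report.docx" is a placeholder file name.
    tree = merge_docx_parts("report.docx")
    tree.write("report-merged.xml", encoding="utf-8", xml_declaration=True)
```

Binary parts such as embedded images are skipped in this sketch, so a stylesheet that needs them would require extra handling.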

Peter Stroll
0

A while back I created a reporting tool that modifies the RenderX XSLT. The original stylesheet converts WordML 2003 to XSL-FO; the modified one converts WordML 2003 to an XSLT template, which is later merged with XML data to generate the final XSL-FO (template + data). You create your template in Word, import the generated XSLT into the web app, and run the query that generates the XML and merges it with your template.

https://github.com/juanmf/neatReports

Documentation

https://github.com/juanmf/neatReports/blob/master/doc/HowToReport.pdf

juanmf