I currently generate docx files programmatically in C#.

I would like to transform my docx files into RTF. This can be done with an XSLT transform. Is there a transform that is publicly available?

I am also interested in transforming docx into PDF and HTML.

Geoff
  • I know that XSLT can be used to transform a docx file into RTF and HTML, and I have found small examples of this on various sites. Word 2007 places the items in the docx package in specific places. Does anyone know of any XSLT that will take a docx file whose parts are named as if Word 2007 had generated the document? I know how to write this from scratch; I just wondered whether a fairly complete XSLT already exists in the public domain. – Geoff Sep 01 '10 at 15:19
  • I've never seen one, and I've seen a lot of OpenXML converters. This is a tall order, though; writing it from scratch would be painful. Maybe start with the OOXML->ODF XSLTs on [SourceForge](http://odf-converter.sourceforge.net/) – Todd Main Sep 04 '10 at 19:10
  • In what scenarios are you planning to use the converter? And have you already considered using Word automation? – Dirk Vollmar Sep 05 '10 at 22:12

2 Answers

Look at OpenXMLViewer, which can transform Open XML (docx) documents into HTML.

RichardLi

As suggested in one of the comments, you could use Word automation. You are already using C#, so running and controlling Word instances is pretty easy. I have done so in the past using VB6 and Java as well. It has been quite stable in my experience, and you get high-quality RTF with minimal effort.
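Roughly, the automation route could look like the sketch below. It assumes Word 2010 or later is installed on the machine and that the project references `Microsoft.Office.Interop.Word`; the `DocxConverter` class and its method names are made up for illustration. `SaveAs2` does not exist in Word 2007 (use `SaveAs` there), and Microsoft advises against automating Office from unattended server processes, so treat this as a desktop or batch-tool approach.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class; the names are illustrative only.
public static class DocxConverter
{
    // Picks the Word save format from the target file extension.
    public static Word.WdSaveFormat FormatFor(string extension)
    {
        switch (extension.ToLowerInvariant())
        {
            case ".rtf":  return Word.WdSaveFormat.wdFormatRTF;
            case ".pdf":  return Word.WdSaveFormat.wdFormatPDF;
            case ".html": return Word.WdSaveFormat.wdFormatFilteredHTML;
            default:
                throw new ArgumentException("Unsupported target: " + extension);
        }
    }

    public static void Convert(string sourcePath, string targetPath)
    {
        // Word resolves relative paths against its own working directory,
        // so hand it absolute paths.
        string source = Path.GetFullPath(sourcePath);
        string target = Path.GetFullPath(targetPath);

        // The underscore interfaces avoid the method/event name clash
        // on Close and Quit.
        Word._Application app = new Word.Application();
        Word._Document doc = null;
        try
        {
            app.Visible = false;
            doc = app.Documents.Open(source, ReadOnly: true);
            doc.SaveAs2(target, FormatFor(Path.GetExtension(target)));
        }
        finally
        {
            // Always close the document and quit, or WINWORD.EXE stays alive.
            if (doc != null) doc.Close(SaveChanges: false);
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

Calling `DocxConverter.Convert("report.docx", "report.rtf")` would then produce the RTF; swapping the extension for `.pdf` or `.html` covers the other two targets from the question.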

Other routes take a longer way around, such as converting your docx to DITA or DocBook and using their toolkits to produce HTML and PDF. The PDF route there most likely goes through XSL-FO. With the right XSL-FO renderer, generating RTF should be just a matter of selecting RTF as the output format instead of PDF.
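Whichever XML route you pick, the first step is the same: read the main document part out of the docx package and run a stylesheet over it. Here is a rough sketch, assuming the main part sits at `/word/document.xml` (where Word 2007 puts it) and that you supply the stylesheet yourself; `DocxXslt` and the file names in the usage note are hypothetical. `System.IO.Packaging` lives in the WindowsBase assembly.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper class; the names are illustrative only.
public static class DocxXslt
{
    public static void Transform(string docxPath, string xsltPath, string outputPath)
    {
        using (Package package = Package.Open(docxPath, FileMode.Open, FileAccess.Read))
        {
            // Assumption: Word-style layout. A stricter version would look up
            // the officeDocument relationship instead of hard-coding the path.
            Uri mainUri = new Uri("/word/document.xml", UriKind.Relative);
            PackagePart mainPart = package.GetPart(mainUri);

            XslCompiledTransform xslt = new XslCompiledTransform();
            xslt.Load(xsltPath);

            using (Stream partStream = mainPart.GetStream(FileMode.Open, FileAccess.Read))
            using (XmlReader reader = XmlReader.Create(partStream))
            // OutputSettings carries the stylesheet's xsl:output choices
            // (xml, html or text) over to the writer.
            using (XmlWriter writer = XmlWriter.Create(outputPath, xslt.OutputSettings))
            {
                xslt.Transform(reader, writer);
            }
        }
    }
}
```

For example, `DocxXslt.Transform("report.docx", "docx2docbook.xsl", "report.xml")`, where `docx2docbook.xsl` stands in for whatever stylesheet you end up writing or finding. Note that this only feeds `document.xml` to the stylesheet; styles, numbering and images live in other parts of the package and would need to be passed in separately.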

HTH!

grtjn