I need to automate the conversion of Word documents to PDF. While researching, I found that since Microsoft Office 2007, Word documents have been XML-based. I also found a free tool, ApacheFOP, that converts XML to PDF, but I still haven't managed to find a way to automate it from C#. There is nFOP (a version that runs on the .NET Framework), but I couldn't find any detailed explanation of how to use it.
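To make "XML-based" concrete: a .docx file is a ZIP archive whose main part, `word/document.xml`, uses the WordprocessingML vocabulary, which is a different vocabulary from the XSL-FO that FOP reads. The following is only a minimal sketch of how one might check that. It is written in Python for brevity (an assumption, since the question is about C#), and the helper names and the in-memory sample document are hypothetical, made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of WordprocessingML, the XML vocabulary used inside .docx files.
WORDML_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Namespace of XSL-FO, the XML vocabulary that Apache FOP accepts as input.
XSLFO_NS = "http://www.w3.org/1999/XSL/Format"


def main_part_root_tag(docx_file):
    """Return the root tag of word/document.xml in a .docx (path or file object)."""
    with zipfile.ZipFile(docx_file) as archive:
        with archive.open("word/document.xml") as part:
            return ET.parse(part).getroot().tag


def build_sample_docx():
    """Build a tiny in-memory stand-in for a .docx (illustration only, not a valid Word file)."""
    body = (
        f'<w:document xmlns:w="{WORDML_NS}">'
        "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>"
        "</w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    tag = main_part_root_tag(build_sample_docx())
    print(tag)
    # True would mean the part is already XSL-FO and could go straight to FOP.
    print(tag.startswith("{" + XSLFO_NS + "}"))
```

Passing the path of a real .docx to `main_part_root_tag` should work the same way, since the function only relies on the archive containing `word/document.xml`.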
3
-
Microsoft Word XML is vastly different from the XML that ApacheFOP converts to PDF (xsl-fo). This question addresses the conversion of Word XML to xsl-fo: http://stackoverflow.com/questions/17029603/xslt-xml-word-to-xsl-fo-pdf – Frank Rem May 22 '14 at 09:39
-
In other words, if I want to use ApacheFOP, I first have to transform the Word document into the XSL Formatting Objects (XSL-FO) format, and then use ApacheFOP to convert that to PDF, right? – BeeCoding May 22 '14 at 09:53
-
Yes. I haven't done this but it seems like a path worth trying. – Frank Rem May 22 '14 at 09:56
-
Okay, let me give it a try. Thank you Frank! :) – BeeCoding May 22 '14 at 09:59
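The second stage discussed in the comments above (XSL-FO to PDF) can be driven from the Apache FOP command-line launcher. The sketch below only builds and optionally runs that command; it assumes a `fop` launcher is on the PATH, and the helper names and file names are hypothetical placeholders. The first stage, turning Word XML into XSL-FO, needs a stylesheet that is not shown here. Python is used only to keep the sketch short; the same command could be started from C# with `System.Diagnostics.Process`.

```python
import shutil
import subprocess


def fop_fo_to_pdf_command(fo_path, pdf_path, executable="fop"):
    """Command line that renders an existing XSL-FO file to PDF."""
    return [executable, "-fo", fo_path, "-pdf", pdf_path]


def fop_xml_to_pdf_command(xml_path, xslt_path, pdf_path, executable="fop"):
    """Command line that applies an XSLT (which must emit XSL-FO) and renders the result to PDF."""
    return [executable, "-xml", xml_path, "-xsl", xslt_path, "-pdf", pdf_path]


def run_if_available(command):
    """Run the command only when its executable is on PATH; return whether it was run."""
    if shutil.which(command[0]) is None:
        return False
    # check=True raises CalledProcessError if FOP exits with a non-zero status.
    subprocess.run(command, check=True)
    return True


if __name__ == "__main__":
    # File names are placeholders for illustration.
    print(fop_fo_to_pdf_command("document.fo", "document.pdf"))
    print(fop_xml_to_pdf_command("document.xml", "word-to-fo.xsl", "document.pdf"))
```

The `-xml`/`-xsl` form lets FOP run the transformation and the rendering in one call, but the quality of the PDF still depends entirely on the stylesheet that maps Word XML to XSL-FO.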
3 Answers
2
You could use docx4j.NET.
It is a .NET version of docx4j, a Java library that converts docx to PDF using FOP.
Before you go to the effort of downloading it, you might want to try the online demo to see whether the PDF output is close to what you need.
**Disclosure: I lead the docx4j project.**

JasonPlutext
-
This is a solution worth noting, but judging by the demo site, the PDF output doesn't meet my needs. – BeeCoding May 23 '14 at 06:20
-
OK, well - assuming a large gap - you may as well forget about using FOP; the effort to get the output you want will be too much. Instead, see whether LibreOffice/OpenOffice output is good enough for your needs. – JasonPlutext May 23 '14 at 08:20
-
OK. I also found PDFCreator; it's free and the PDF output is actually very good, so it seems worth trying. Are you familiar with it? – BeeCoding May 23 '14 at 08:59
-
If that's the virtual printer one, it depends on having Word (or OO/LO) installed, in which case query what you gain over the native PDF output support in Word/OO/LO. – JasonPlutext May 23 '14 at 09:25
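For the LibreOffice route suggested in the comments above, the native PDF export can be invoked without the GUI through the `soffice` command line. This is a rough sketch that assumes LibreOffice is installed and its `soffice` executable is on the PATH; the helper names and file names are hypothetical. Python is used only for brevity, and the same command could be launched from C# with `System.Diagnostics.Process`.

```python
import shutil
import subprocess
from pathlib import Path


def libreoffice_pdf_command(docx_path, out_dir, executable="soffice"):
    """Command line that converts one document to PDF in headless mode."""
    return [executable, "--headless", "--convert-to", "pdf", "--outdir", out_dir, docx_path]


def expected_pdf_path(docx_path, out_dir):
    """LibreOffice names the output after the input file, with a .pdf extension."""
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def convert_if_available(docx_path, out_dir):
    """Run the conversion only when soffice is on PATH; return the PDF path or None."""
    command = libreoffice_pdf_command(docx_path, out_dir)
    if shutil.which(command[0]) is None:
        return None
    # check=True raises CalledProcessError if soffice exits with a non-zero status.
    subprocess.run(command, check=True)
    return expected_pdf_path(docx_path, out_dir)


if __name__ == "__main__":
    # File names are placeholders for illustration.
    print(libreoffice_pdf_command("report.docx", "out"))
    print(expected_pdf_path("report.docx", "out"))
```

Whether the layout fidelity is good enough still has to be judged on your own documents, as with the docx4j demo.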
-
0
-
Yes, but in order to use it, you need to have MS Office installed, and I'm trying to go with a free solution. – BeeCoding May 22 '14 at 09:37
-
I think a paid library is the way to go - or you could try to create your own library wrapping things around ITextSharp... – gerodim May 22 '14 at 09:39
-
I found ITextSharp earlier, and it is free. However, to use it in a commercial app you have to publish your source code, so I kept searching. – BeeCoding May 22 '14 at 09:58
0
I have found a library called Aspose.PDF for .NET that can convert XML to PDF, and vice versa, in C#/.NET. I hope it will solve your problem.

Hudson Lane
-
Aspose.pdf is a very useful tool, but it is commercial, and I'm looking for a free solution. – BeeCoding May 23 '14 at 05:36