3

I need to automate the conversion of Word documents to PDF. From some research, I learned that since Microsoft Office 2007, Word documents have been XML-based. I also found a free tool, Apache FOP, that converts XML to PDF, but I still haven't found a way to automate it from C#. There is nFOP (a version that runs on the .NET Framework), but I couldn't find any detailed explanation of how to use it.

Deduplicator
BeeCoding
  • Microsoft Word XML is vastly different from the XML that ApacheFOP converts to PDF (xsl-fo). This question addresses the conversion of Word XML to xsl-fo: http://stackoverflow.com/questions/17029603/xslt-xml-word-to-xsl-fo-pdf – Frank Rem May 22 '14 at 09:39
  • In other words, if I want to use ApacheFOP, first I have to transform Word document into the XSL Formatting Objects (XSL-FO) format, and from there by using ApacheFOP I can convert it to PDF, right? – BeeCoding May 22 '14 at 09:53
  • Yes. I haven't done this but it seems like a path worth trying. – Frank Rem May 22 '14 at 09:56
  • Okay, let me give a try. Thank you Frank! :) – BeeCoding May 22 '14 at 09:59

3 Answers

2

You could use docx4j.NET

That's a .NET version of docx4j, a Java library that converts docx to PDF using FOP.

See ConvertOutPDF.java

Before you go to the effort of downloading it, you might want to use the online demo to see whether the PDF output is close to your needs.

**Disclosure: I lead the docx4j project.**

JasonPlutext
  • This is a solution worth noting, but using this demo site, PDF output doesn't satisfy my needs. – BeeCoding May 23 '14 at 06:20
  • OK, well - assuming a large gap - you may as well forget about using FOP; the effort to get the output you want will be too much. Instead, see whether LibreOffice/OpenOffice output is good enough for your needs. – JasonPlutext May 23 '14 at 08:20
  • Ok. What I found is PDFCreator, it's free and output PDF is actually very good. It seems worth trying. Are you maybe familiar with it? – BeeCoding May 23 '14 at 08:59
  • If that's the virtual printer one, it depends on having Word (or OO/LO) installed, in which case I'd question what you gain over the native PDF output support in Word/OO/LO. – JasonPlutext May 23 '14 at 09:25
  • Thank you so much, I finally got the whole picture. – BeeCoding May 23 '14 at 10:03
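
The LibreOffice route suggested in the comments above can be scripted from C#. Here is a minimal sketch, assuming LibreOffice is installed and its `soffice` executable is on the PATH (otherwise pass the full path to it). The class and method names are hypothetical; only the `--headless`, `--convert-to` and `--outdir` switches come from LibreOffice itself.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper for illustration; not part of any library.
public static class LibreOfficePdfConverter
{
    // Builds the soffice command line for a headless docx-to-PDF conversion.
    public static string BuildArguments(string inputPath, string outputDir)
    {
        // Paths are quoted so that spaces survive. A trailing backslash in
        // outputDir would escape the closing quote, so pass it without one.
        return "--headless --convert-to pdf --outdir \"" + outputDir + "\" \"" + inputPath + "\"";
    }

    // LibreOffice names the output after the input file, with a .pdf extension.
    public static string ExpectedPdfPath(string inputPath, string outputDir)
    {
        return Path.Combine(outputDir, Path.GetFileNameWithoutExtension(inputPath) + ".pdf");
    }

    // Runs the conversion and returns the path of the generated PDF.
    public static string Convert(string inputPath, string outputDir, string sofficePath = "soffice")
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = sofficePath, // assumption: resolvable via PATH by default
            Arguments = BuildArguments(inputPath, outputDir),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
        }

        // Checking for the output file detects failure regardless of exit code.
        var pdfPath = ExpectedPdfPath(inputPath, outputDir);
        if (!File.Exists(pdfPath))
        {
            throw new InvalidOperationException("LibreOffice did not produce " + pdfPath);
        }
        return pdfPath;
    }
}
```

A call would look like `LibreOfficePdfConverter.Convert(@"C:\docs\report.docx", @"C:\docs\out")`. Whether the resulting PDF is good enough is something you would still have to check against your own documents.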
0

An ugly solution would be to do a "Save As" using Microsoft Office interop.

Read more here

And find the related stackoverflow post here

gerodim
  • Yes, but in order to use it, you need to have MS Office installed, I'm trying to go with some free solution. – BeeCoding May 22 '14 at 09:37
  • I think a paid library is the way to go - or you could try to create your own library wrapping things around ITextSharp... – gerodim May 22 '14 at 09:39
  • I found ITextSharp earlier, and it is free, however in order to use it within some commercial app, you have to publish your source code, therefore I kept searching. – BeeCoding May 22 '14 at 09:58
0

One library that can convert XML to PDF (and vice versa) in C#/.NET is Aspose.PDF for .NET. I hope it solves your problem.