I have a function that converts a DOCX file (as a byte[]) to a PDF (as a byte[]) using Microsoft.Office.Interop.Word.

It works great, except that it doesn't work online: it requires Microsoft Office to be installed on the server, which I cannot do anything about.

So I need to switch to something else, and I'm thinking about Open XML (unless you know a better way).

But how exactly would I go about this? I just want to take the DOCX file, convert it, and return it as a PDF in byte[] format.

My current code using Microsoft.Office.Interop.Word looks like this:

public static byte[] ConvertDocx2PDF(byte[] DocxFile, string FileName)
{
    try
    {
        string path = Path.Combine(HttpRuntime.AppDomainAppPath, "MailFiles/DOCX2PDF");

        if (!Directory.Exists(path))
            Directory.CreateDirectory(path);

        Guid id = Guid.NewGuid();

        FileName = id.ToString() + FileName;

        path += "/" + FileName;

        if (File.Exists(path))
            File.Delete(path);

        File.WriteAllBytes(path, DocxFile);

        Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();

        object oMissing = System.Reflection.Missing.Value;

        word.Visible = false;
        word.ScreenUpdating = false;

        // Cast as Object for word Open method
        Object filename = (Object)path;
        // Use the dummy value as a placeholder for optional arguments
        Microsoft.Office.Interop.Word.Document doc = word.Documents.Open(ref filename, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing);
        doc.Activate();
        object outputFileName = (object)path.ToLower().Replace(".docx", ".pdf");
        object fileFormat = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatPDF;

        if (File.Exists(outputFileName.ToString()))
            File.Delete(outputFileName.ToString());

        // Save document into PDF Format
        doc.SaveAs(ref outputFileName,
            ref fileFormat, ref oMissing, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing,
            ref oMissing, ref oMissing, ref oMissing, ref oMissing);

        object saveChanges = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
        ((Microsoft.Office.Interop.Word._Document)doc).Close(ref saveChanges, ref oMissing, ref oMissing);
        doc = null;

        ((Microsoft.Office.Interop.Word._Application)word).Quit(ref oMissing, ref oMissing, ref oMissing);
        word = null;

        try
        {
            File.Delete(path);
        }
        catch { }

        return File.ReadAllBytes(path.ToLower().Replace(".docx", ".pdf"));
    }
    catch (Exception)
    {
        // Conversion failed: the error is swallowed and an empty array is returned below.
    }
    byte[] errorByte = new byte[0];
    return errorByte;
}

As I said, it works great but doesn't work on my server.

Any idea how to do this with Open XML or anything else?

Thank you for your time

HenrikP

2 Answers

You can use the Open XML SDK together with the Open XML PowerTools to convert the DOCX to HTML, and then convert that HTML to PDF. No interop is required. For the final step you can use wkhtmltopdf as a DLL to create the PDF from the HTML. The resulting PDF renders in a web browser. This worked for me.
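A minimal sketch of that two-step pipeline, under some assumptions: it uses the `HtmlConverter` class from Open XML PowerTools (newer releases also expose it as `WmlToHtmlConverter`), and it calls the wkhtmltopdf command-line executable rather than a DLL wrapper, so the process must be allowed to start on the server. The class name `DocxToPdfSketch`, the method names, and the `wkhtmltopdfExe` parameter are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static class DocxToPdfSketch
{
    // Step 1: DOCX bytes -> HTML markup, no Word installation involved.
    public static string DocxToHtml(byte[] docx)
    {
        if (docx == null || docx.Length == 0)
            throw new ArgumentException("Empty document.", "docx");

        using (var stream = new MemoryStream())
        {
            // Copy into an expandable stream because the package is opened as editable.
            stream.Write(docx, 0, docx.Length);
            using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
            {
                var settings = new HtmlConverterSettings { PageTitle = "Converted document" };
                XElement html = HtmlConverter.ConvertToHtml(wordDoc, settings);
                return html.ToString(SaveOptions.DisableFormatting);
            }
        }
    }

    // Step 2: HTML -> PDF bytes by running wkhtmltopdf on temporary files.
    // wkhtmltopdfExe is the (assumed) full path to the wkhtmltopdf executable.
    public static byte[] HtmlToPdf(string html, string wkhtmltopdfExe)
    {
        string basePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string htmlPath = basePath + ".html";
        string pdfPath = basePath + ".pdf";
        try
        {
            File.WriteAllText(htmlPath, html, Encoding.UTF8);
            var startInfo = new ProcessStartInfo
            {
                FileName = wkhtmltopdfExe,
                Arguments = "--quiet \"" + htmlPath + "\" \"" + pdfPath + "\"",
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                if (process.ExitCode != 0 || !File.Exists(pdfPath))
                    throw new InvalidOperationException(
                        "wkhtmltopdf failed with exit code " + process.ExitCode);
            }
            return File.ReadAllBytes(pdfPath);
        }
        finally
        {
            // File.Delete does not throw when the file is already gone.
            File.Delete(htmlPath);
            File.Delete(pdfPath);
        }
    }

    // Same shape as the interop version: DOCX bytes in, PDF bytes out.
    public static byte[] ConvertDocx2PDF(byte[] docx, string wkhtmltopdfExe)
    {
        return HtmlToPdf(DocxToHtml(docx), wkhtmltopdfExe);
    }
}
```

The result is only as faithful as the intermediate HTML, so page-level layout such as headers, footers, and exact page breaks may not match what Word would produce. As far as I know, embedded images are skipped unless an image handler is configured on the converter settings.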

Links:

OpenXml Docx to Html

Docx to Html using XSLT

Hope this helps!

Mohamed Alikhan

DOCX is a document description format, whereas you can think of PDF as a vector graphics format. Even though PDF tries pretty hard to masquerade as a document format, it is inherently a graphics format.

What does this mean? It means a proper conversion requires rendering the document. Basically, you'd have to reimplement a core part of MS Word to make it reliable.

I guess some libraries exist, but they will probably cost you much more than getting a server where you can just install a copy of Word.

But after all, OpenOffice can render Word documents, so perhaps someone could try to embed it into a (gargantuan) library...

EDIT: Actually, I found this answer, which may be helpful, but it says it requires OpenOffice to be installed. Perhaps it could work with an xcopied OOo (OpenOffice.org); you could try it out.

Lucas Trzesniewski