I am investigating the effort it would take to render a Microsoft Office XML file (.docx, in this case) to an image programmatically. For illustration, I want to achieve something similar to Apple's QuickLook preview for said file. Requirements:
- Must be portable (specifically, it will run on neither Windows nor any other platform where Microsoft Office is available)
- Needs to be headless and light on resources (think VPS!)
- Preferably a self-contained, well-maintained open-source solution :)
- Text extraction would be nice (although another library could be used for that - I already have this)
- A good online service could do it as a last resort, if I fail to find a good offline solution
- Accuracy is nice to have, but not the primary goal here
My attempts to locate such a library have not been entirely successful. There are a few Java-based projects that seem to have sprung from OpenOffice, but they all seem either a bit heavyweight or focused on the wrong thing (i.e. text extraction, search, document generation).
To reiterate, I am looking to render the document (e.g. to a PNG). Speed and memory use are more important than support for features such as OLE images, equations, advanced formatting and whatnot.