I need to manipulate docx files (find/replace on placeholders, plus checking/unchecking checkboxes). Since ColdFusion 10 integrates well with Java, I decided to try the Java library docx4j, which basically mimics the OpenXML SDK (.NET platform).

I have the docx4j JAR inside a custom folder, which I have set up in my Application.cfc via JavaSettings (new in CF10; I tried it with other JARs and it works):

<cfcomponent output="false">

    <cfset this.javaSettings =
        {LoadPaths = ["/myJava/lib"], loadColdFusionClassPath = true, reloadOnChange= true, 
        watchInterval = 100, watchExtensions = "jar,class,xml"} />

</cfcomponent>

Now, I'm trying to use this sample: https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/samples/VariableReplace.java

But loading WordprocessingMLPackage fails: CreateObject() reports that the class doesn't exist:

<cfset docObj = createObject("java","org.docx4j.openpackaging.packages.WordprocessingMLPackage") />

Any ideas? I'm not really a Java guy, but there are not many options out there for docx manipulation.

  • The class does not exist...hmmm..seems like it was not found. Are you sure the jar file is where it is supposed to be? Used java with Coldfusion 7 but it seems like they are playing nicely now. – Andreas May 30 '12 at 23:18
  • Post the full stack trace. However, I will say I tried using docx4j with CF9 a while back and could not make it work. It is nothing against docx4j. It seemed like a pretty good library. I just ran into too many class loader conflicts between docx4j's dependencies and CF's internal libraries. Unfortunately, I was not able to figure out how to resolve them - even with the JavaLoader. I have not tried it with CF10 though, so YMMV. – Leigh May 31 '12 at 00:45
  • Using the new JavaSettings property in CF10, I didn't have any issues loading up any other JAR files and accessing classes. I thought maybe it was that particular class, so I tried a different class to create an object out of (org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart), and it worked just fine. I'll keep fooling around with it trying to actually see if I can build/manipulate a document. Edit: Alright now it is working, I think the problem was I wasn't supplying the _init()_ constructor. –  May 31 '12 at 04:25
  • Cool! If the class loader issues are gone I might give it another whirl. *Now* I am really excited about CF10 :) – Leigh May 31 '12 at 04:25

3 Answers

Alright. Seems like I got everything working. I just have to figure out how to do a find/replace, and everything else I want to do in a docx document. Here's my code so far to show you guys that it looks like it is working (make sure that your Application.cfc looks like the one in the original post if you are on CF10):

<cfscript>

    docPackageObj = createObject("java","org.docx4j.openpackaging.packages.WordprocessingMLPackage").init();
    docObj = createObject("java","org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart").init();
    xmlUtilObj = createObject("java","org.docx4j.XmlUtils").init();
    wmlDocObj = createObject("java","org.docx4j.wml.Document").init();
    saveToZipFile = createObject("java","org.docx4j.openpackaging.io.SaveToZipFile").init(docPackageObj);

    strFilePath = getDirectoryFromPath(getCurrentTemplatePath()) & "testDoc.docx";

    wordMLPackage = 
        docPackageObj.load(createObject("java","java.io.File").init(javaCast("string",strFilePath)));

    documentPart = wordMLPackage.getMainDocumentPart();

    // unmarshallFromTemplate requires string input     
    strXml = xmlUtilObj.marshaltoString(documentPart.getJaxbElement(),true);

    writeDump(var="#strXml#");

</cfscript>

Now, does anybody know how to cast structures in ColdFusion into hashmaps (or collections in general)? I think structures in CF are actually util.Vector, whereas hashmaps are util.HashMap. All of the docx4j examples I see that demonstrate find/replace on placeholders use this:

HashMap<String, String> mappings = new HashMap<String, String>();
mappings.put("colour", "green");
mappings.put("icecream", "chocolate");
  • Very nice, I cannot wait to try it out. As far as data types, CF structures are `java.util.Map` objects and arrays are `java.util.Vector`. So I am guessing a structure will work. For more details see [data type conversions](http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSc3ff6d0ea77859461172e0811cbec22c24-7884.html) **Edit**: Quick question, which docx4j version are you using and what is your o/s? – Leigh Jun 01 '12 at 02:34
  • Also, did you do anything special with the jars or just load everything ie all dependencies? – Leigh Jun 01 '12 at 02:43
  • Okay thanks. On second thought you may not be able to use a CF structure. It looks like that library is coded to the implementation (HashMap) instead of the interface (Map)... ugh. I hate when they do that. – Leigh Jun 01 '12 at 02:53
  • I used the newest docx4j (2.8.0) and also placed log4j-1.2.15.jar into the same folder as docx4j. I think log4j is the only dependency that you have to have (sorry I forgot to mention that). **EDIT** The OS I'm using is MS Server 2008 R2 at work, and it looks like it is working at home on Windows 7 using the built-in Tomcat server that ships with CF10. Using the example above, all it does is dump the XML text of the document, but you MUST have text inside that document (it can't be blank or it will toss a byte error). –  Jun 01 '12 at 02:58
  • Hm... it has been a while but I thought it required more than just docx4j. Though many of the libraries already exist in CF. So that may have changed. (Edit) I just tried it and the replace example works with a `HashMap` - which is awesome. I am so glad you opened this thread! Wish I could upvote you twice :) – Leigh Jun 01 '12 at 03:06
  • Could you post how you implemented the rest of the example? The hashmap and _documentPart.setJaxbElement((Document) obj);_ part of the example kind of tossed me off (I'm not the greatest at Java). But I believe for find/replace in a docx document, the placeholder can be split across multiple w:t elements (so I guess an XPath expression is needed). –  Jun 01 '12 at 04:58
  • In that simple example `unmarshallFromTemplate` is supposed to do it all for you using the settings in the `HashMap`. So you do not need to do much beyond translating the remaining lines. Note, I only tested it with the sample file. So I am not sure how thorough it is. Anyway, I will post the code I used in a minute.. – Leigh Jun 01 '12 at 05:46
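
To make the role of the `HashMap` concrete, here is a minimal plain-Java sketch of the kind of substitution discussed above: each key in the map is looked up for a `${key}` placeholder in the XML string. It assumes placeholders are written as `${colour}`, as in the docx4j sample document. The class and method names are hypothetical, and this is not docx4j's actual implementation, only an illustration of the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of docx4j.
class PlaceholderSketch {

    // Matches ${name}; this placeholder syntax is an assumption based on the sample document.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${key} whose key is present in the map; unknown keys are left as-is.
    static String replacePlaceholders(String template, Map<String, String> mappings) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = mappings.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in the value from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("colour", "green");
        mappings.put("icecream", "chocolate");

        String xml = "<w:t>My favourite colour is ${colour} and ice cream is ${icecream}.</w:t>";
        System.out.println(replacePlaceholders(xml, mappings));
    }
}
```

Note that this only works when the whole placeholder sits inside one contiguous piece of text, which is also the precondition for the simple replace example.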

Have you tried setting loadColdFusionClassPath = false instead of true? Perhaps there is a conflict with some of the JARs that ship w/ CF.

Sean Coyne

(Not really a new answer, but it is too much code for comments ..)

Here is the full code for the docx4j VariableReplace.java example:

<cfscript>
    saveToDisk = true;
    inputFilePath = ExpandPath("./docx4j/sample-docs/word/unmarshallFromTemplateExample.docx");
    outputFilePath = ExpandPath("./OUT_VariableReplace.docx");

    inputFile = createObject("java", "java.io.File").init(inputFilePath);
    wordMLPackage = createObject("java","org.docx4j.openpackaging.packages.WordprocessingMLPackage").load(inputFile);
    documentPart = wordMLPackage.getMainDocumentPart();

    XmlUtils = createObject("java","org.docx4j.XmlUtils");
    xmlString  = XmlUtils.marshaltoString(documentPart.getJaxbElement(),true);

    mappings = createObject("java", "java.util.HashMap").init();
    mappings["colour"] = "green";
    mappings["icecream"] =  "chocolate";
    obj = XmlUtils.unmarshallFromTemplate(xmlString , mappings);
    documentPart.setJaxbElement(obj);

    if (saveToDisk) {
        saveToZipFile = createObject("java","org.docx4j.openpackaging.io.SaveToZipFile").init(wordMLPackage);
        saveToZipFile.save( outputFilePath );
    } 
    else {
        WriteDump(XmlUtils.marshaltoString(documentPart.getJaxbElement(), true, true));
    }
</cfscript>
Leigh
  • Looks like it works, you just have to adjust the HashMap since the dot notation won't work. Instead this does (I just passed in string variables): _mappings.put("colour",#trim(variables.myColour)#);_ _mappings.put("icecream",#trim(variables.myIcecream)#);_ One thing I have noticed is that if you modify the document (the original with placeholders) at all, this will cause the placeholder to be spread out on multiple w:t elements causing it not to work. I'm guessing we will have to use some type of XPath expression? I'm digging through the samples currently for a solution. –  Jun 01 '12 at 18:26
  • Sorry, my bad. It was late and I pasted the wrong version. It was supposed to be bracket notation. (Updated). I am not sure about the elements. I would have to take a closer look to see what is happening. – Leigh Jun 01 '12 at 18:43
  • Yeah, it looks like there could potentially be w:r, w:t and w:rPr elements inside a single placeholder if you edit the document and change things around, etc. So I'm trying to figure out a way that a placeholder can be found/replaced no matter if it is spread out amongst multiple elements that Word might toss in it. :/ –  Jun 04 '12 at 17:06
  • I have not had much time to review the samples. Did you try looking at the openxml docs/forums? Because obviously any xml/regex solutions are going to be portable to CF. – Leigh Jun 04 '12 at 17:46
  • I found out how it is done with the Open XML SDK: http://msdn.microsoft.com/en-us/library/bb508261.aspx Looks like they use a built in regular expression class. Can't really find any documentation of a foolproof Docx4j example though. I posted this question on the Docx4j forum, but the response is above my head. –  Jun 04 '12 at 18:29
  • From the comments, it seems like that method may suffer from the same problem. I am not clear on Jason's response (in the docx4j forums), so I posted a follow up question. – Leigh Jun 04 '12 at 18:56
  • I saw the follow up, and his response. I checked out the examples and yeah.. over my head. I'll try and see if I can figure it out, since a placeholder find/replace will not work very well unless it can handle placeholders that are spread out amongst multiple runs/properties, etc. –  Jun 07 '12 at 15:57
  • At least we know it is possible from what Jason said. I still have to look over the example though. This week has been pretty hectic. – Leigh Jun 07 '12 at 17:37
  • Leigh, I was wondering if you tested out anything with Docx4j yet? Jason made a VariablePrepare Java example, albeit a little over my head. What it does is clean up a docx document so that variables are not split across multiple runs. https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/samples/VariablePrepare.java Not sure if that method would be better than the "binding content controls to an XML Part (via XPath)" way. –  Jul 08 '12 at 17:00
  • Sorry, I started to look at it but got crazy busy with a project and have not had time to get back to it yet. – Leigh Jul 15 '12 at 21:03
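
The split-run problem discussed in these comments can be shown without docx4j at all. The sketch below is plain Java with hypothetical names: it models the text of consecutive runs as a list of strings, shows that replacing inside each run separately misses a placeholder that Word has split, and shows that joining the run text first makes the replacement succeed. It is not the algorithm used by the VariablePrepare sample, and collapsing everything into one run is a simplifying assumption that would discard per-run formatting in a real document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of docx4j.
class SplitRunSketch {

    // Naive approach: replace inside each run's text independently.
    static List<String> replacePerRun(List<String> runs, String placeholder, String value) {
        List<String> out = new ArrayList<>();
        for (String run : runs) {
            out.add(run.replace(placeholder, value));
        }
        return out;
    }

    // Merge approach: join the run texts, replace once, and return a single run.
    static List<String> replaceAfterMerge(List<String> runs, String placeholder, String value) {
        String joined = String.join("", runs);
        List<String> out = new ArrayList<>();
        out.add(joined.replace(placeholder, value));
        return out;
    }

    public static void main(String[] args) {
        // The placeholder ${colour} has been split across three runs.
        List<String> runs = Arrays.asList("Colour: ${col", "our", "}");

        System.out.println(replacePerRun(runs, "${colour}", "green"));
        System.out.println(replaceAfterMerge(runs, "${colour}", "green"));
    }
}
```

In the per-run version no single string contains the full placeholder, so nothing is replaced. That is why a clean-up pass that puts each placeholder back into one run is needed before the simple replace can work.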