Docx4j: Comparing two documents fails due to different Element types

Question

I am trying to write some JUnit tests for a Docx4J generator I have written. I want to compare the output node of my generator with an expected node that I load from a string.

So, I create my "actual" node (generator output) like so:

Node xmlNodeActual = XmlUtils.marshaltoW3CDomDocument(actual).getDocumentElement();

Where "actual" is the Object that was created by my generator.

For my "expected" node, I have written the following code:

Document doc = docBuilder.parse(new InputSource(new ByteArrayInputStream(strXmlNode.getBytes("utf-8"))));
Node xmlNodeExpected = doc.getDocumentElement();

strXmlNode is a string holding the expected xml. Although my two nodes are equal as far as I can tell from a visual diff, calling the following yields 'false' as a result:

xmlNodeActual.isEqualNode(xmlNodeExpected)

I suspect the reason is that the runtime types of the two nodes differ:

xmlNodeActual: org.apache.xerces.dom.DeferredElementImpl
xmlNodeExpected: org.apache.xerces.dom.ElementNSImpl

I like my test design since it would allow me to write a lot of test cases rather quickly for a large generator. However, I don't see a way to utilize this approach in combination with "isEqualNode". Do I have to write my own comparer or is there a way I am not aware of to make sure the types of the nodes are the same?

score 0 · Answer 1 · answered Aug 09 '17 at 20:29

One problem with a method like this is that it gives only a boolean answer; it doesn't tell you what the actual differences between the two nodes are. Another problem is that you can't tell it which differences you consider significant: for example, as far as I can see, redundant namespace declarations are considered significant by this particular method. Whitespace is often problematic too. I had the same problem using the XPath deep-equal() function, and wrote the saxon:deep-equal variant as a result. But I now prefer to test expected results using a set of XPath assertions. The W3C XSLT test suite uses this technique, with test assertions like this:

<result>
   <all-of>
      <assert>/root/p[1]/text()[1] = 'Tekst '</assert>
      <assert>/root/p[1]/text()[2] = ' etc..'</assert>
      <assert>/root/p[2]/text()[1] = 'Tekst '</assert>
      <assert>/root/p[2]/text()[2] = ' etc..'</assert>
   </all-of>
</result>

I did at one time have a little tool that would generate such a list of assertions from an XML document, but I now tend to do them manually. The great advantage is that if something is wrong, the diagnostics will tell you which assertion failed.
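The same idea can be sketched in plain Java. This is only an illustration, assuming the JDK's built-in XPath 1.0 engine rather than the W3C test harness; the class name, method names, and sample XML are hypothetical. Each assertion is evaluated as a boolean XPath expression, and the ones that fail are collected so the test output names them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathAssertions {

    // Hypothetical helper: parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Evaluates each XPath expression as a boolean and returns those that are false.
    static List<String> failedAssertions(Document doc, String... assertions) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> failed = new ArrayList<>();
        for (String assertion : assertions) {
            try {
                Boolean holds = (Boolean) xpath.evaluate(assertion, doc, XPathConstants.BOOLEAN);
                if (!holds) {
                    failed.add(assertion);
                }
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Invalid XPath: " + assertion, e);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        Document doc = parse("<root><p>Tekst <b>x</b> etc..</p><p>Tekst <i>y</i> etc..</p></root>");
        List<String> failed = failedAssertions(doc,
                "/root/p[1]/text()[1] = 'Tekst '",
                "/root/p[1]/text()[2] = ' etc..'",
                "/root/p[2]/text()[1] = 'Tekst '",
                "/root/p[2]/text()[2] = ' etc..'");
        System.out.println("Failed assertions: " + failed);
    }
}
```

In a JUnit test, the returned list could be passed to an assertion that it is empty, so a failure message lists exactly which expressions did not hold.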

Michael Kay

Thanks Michael, I am aware of the limitations of my approach. But given the technological and non-functional constraints I'm working in, it would be a good compromise. The nodes I compare are small, but many. Neither redundant namespace declarations nor whitespace are issues in my setup, but thanks for pointing them out. I've written something like saxon:deep-equal for a different language than XML before, but my question is less about my approach than about whether there is an easy way to control what Node types the parser produces. – Robert Walter Aug 09 '17 at 21:32
The spec gives no hint that the comparison is allowed to fail based on the implementation class of the second Node tree, but of course there could be a bug in the implementation. Personally I think it's more likely that there is a minor difference between the two trees that you have overlooked. It might be worth downloading Saxon and seeing what fn:deep-equal() (or saxon:deep-equal()) has to say. – Michael Kay Aug 10 '17 at 07:15
I think I have to disagree here: check out the [doc](https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html#isEqualNode). It reads: _Two nodes are equal if and only if the following conditions are satisfied:_ _The two nodes are of the same type._ _..._ – Robert Walter Aug 10 '17 at 09:21
My reading of "type" there is "getNodeType()", e.g. both attributes or both comments. I don't think it's intended to mean they must have the same implementation class, since that's often outside the programmer's control. – Michael Kay Aug 10 '17 at 18:01
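One possible explanation for a "minor difference" that a visual diff would not show, not confirmed anywhere in this thread, is namespace awareness: a DocumentBuilderFactory is not namespace-aware by default, and isEqualNode compares localName, namespaceURI and prefix in addition to nodeName. The sketch below parses the same string twice, with and without setNamespaceAware(true), and prints the properties involved. The class name and the namespace URI are made up for the illustration, and it uses only the JDK parser, not docx4j.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceAwareParsing {

    // Hypothetical helper: parses the string and returns the document element.
    static Element parse(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The factory default is false, so this must be set explicitly.
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    static void describe(String label, Element e) {
        System.out.println(label + ": class=" + e.getClass().getName()
                + ", nodeName=" + e.getNodeName()
                + ", localName=" + e.getLocalName()
                + ", namespaceURI=" + e.getNamespaceURI());
    }

    public static void main(String[] args) {
        // Namespace URI invented for this sketch.
        String xml = "<w:p xmlns:w=\"http://example.org/hypothetical\"><w:r/></w:p>";

        Element plain = parse(xml, false);
        Element aware = parse(xml, true);
        Element awareAgain = parse(xml, true);

        describe("not namespace-aware", plain);
        describe("namespace-aware", aware);

        System.out.println("aware vs aware: " + aware.isEqualNode(awareAgain));
        System.out.println("aware vs plain: " + aware.isEqualNode(plain));
    }
}
```

If this is what is happening, calling setNamespaceAware(true) on the factory behind docBuilder would be a smaller change than switching comparison tools, but that is an assumption to verify against the actual setup.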

score 0 · Answer 2 · answered Aug 09 '17 at 23:52

Three other possibilities you might consider:

JasonPlutext

Hi @JasonPlutext. Thanks for the hints. Those, in combination with a good night of sleep, led me to the pragmatic solution I was looking for (will update my question soon with what I ended up with for now). In general, I think xmlunit is the right way to go. On a side note: my reputation isn't high enough yet to show my upvote of this answer. – Robert Walter Aug 10 '17 at 06:46

score 0 · Accepted Answer · answered Aug 10 '17 at 07:01

Note that @Michael Kay and @JasonPlutext contributed interesting and better alternatives for testing XML output in general, which you might want to consider.

As for my specific question and problem, i.e. trivially comparing two XML nodes with "isEqualNode", one stemming from an input string and one from a data transformation, I had to do the following: instead of parsing the string, I unmarshal it via an InputStream and marshal it again, thus getting the desired node types.

// creating the "actual" node I want to test (nothing changed here)
Node xmlNodeActual = XmlUtils.marshaltoW3CDomDocument(actual).getDocumentElement();

//...

// Instead of parsing the string, just unmarshal and marshal it once
Object expected = XmlUtils.unmarshal(new ByteArrayInputStream(strXmlNode.getBytes("utf-8")));
Node xmlNodeExpected = XmlUtils.marshaltoW3CDomDocument(expected).getDocumentElement();
if (!xmlNodeActual.isEqualNode(xmlNodeExpected)) {
    // ...
}

This produces the same node types and works as expected in my setup. Still, this way of comparing two XML trees has several flaws, as Michael Kay has pointed out, so don't consider it a best practice; refer to the other answers for general XML comparison.
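Since isEqualNode only returns a boolean, a small helper that reports where two trees first diverge can help when the result is unexpectedly false. The following is a rough sketch, not part of docx4j or the DOM API: the class and method names are hypothetical, and it checks only a subset of what isEqualNode compares (node type, names, namespace, value, attributes by qualified name, and children in order).

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstDifference {

    // Hypothetical helper: parses a string and returns the document element.
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns a description of the first difference found, or null if none was found.
    static String find(Node a, Node b, String path) {
        String here = path + "/" + a.getNodeName();
        if (a.getNodeType() != b.getNodeType()) {
            return here + ": node type " + a.getNodeType() + " vs " + b.getNodeType();
        }
        String[] labels = {"node name", "local name", "namespace URI", "prefix", "node value"};
        String[] left = {a.getNodeName(), a.getLocalName(), a.getNamespaceURI(), a.getPrefix(), a.getNodeValue()};
        String[] right = {b.getNodeName(), b.getLocalName(), b.getNamespaceURI(), b.getPrefix(), b.getNodeValue()};
        for (int i = 0; i < labels.length; i++) {
            if (!Objects.equals(left[i], right[i])) {
                return here + ": " + labels[i] + " '" + left[i] + "' vs '" + right[i] + "'";
            }
        }
        // Attributes are matched by qualified name; getAttributes() is null for non-elements.
        NamedNodeMap attrsA = a.getAttributes();
        NamedNodeMap attrsB = b.getAttributes();
        int countA = attrsA == null ? 0 : attrsA.getLength();
        int countB = attrsB == null ? 0 : attrsB.getLength();
        if (countA != countB) {
            return here + ": attribute count " + countA + " vs " + countB;
        }
        for (int i = 0; i < countA; i++) {
            Node attrA = attrsA.item(i);
            Node attrB = attrsB.getNamedItem(attrA.getNodeName());
            String attrPath = here + "/@" + attrA.getNodeName();
            if (attrB == null) {
                return attrPath + ": missing in second tree";
            }
            if (!Objects.equals(attrA.getNodeValue(), attrB.getNodeValue())) {
                return attrPath + ": value '" + attrA.getNodeValue() + "' vs '" + attrB.getNodeValue() + "'";
            }
        }
        // Children are compared pairwise in document order.
        NodeList kidsA = a.getChildNodes();
        NodeList kidsB = b.getChildNodes();
        if (kidsA.getLength() != kidsB.getLength()) {
            return here + ": child count " + kidsA.getLength() + " vs " + kidsB.getLength();
        }
        for (int i = 0; i < kidsA.getLength(); i++) {
            String difference = find(kidsA.item(i), kidsB.item(i), here);
            if (difference != null) {
                return difference;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node expected = parse("<r><a x=\"1\"/></r>");
        Node actual = parse("<r><a x=\"2\"/></r>");
        System.out.println(find(expected, actual, ""));
    }
}
```

In the test above, the helper could be called inside the if block, with xmlNodeActual and xmlNodeExpected as arguments, to put the location of the mismatch into the failure message.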
