I am trying to read XML from a file, but the order of the attributes changes in the process. I know attribute order normally does not matter, but in my case it does, because I am hashing the document.

I am using the code below, but it orders the attributes alphabetically:

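import java.io.File;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Parse the file into a DOM tree
File fXmlFile = new File("C:\\Users\\Desktop\\abc.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);

// Serialize the DOM back to a string with an identity transform
DOMSource domSource = new DOMSource(doc);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.transform(domSource, result);
String xml = writer.toString();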

This is the XML that I am trying to read into the variable xml. What I get back is the XML with its attributes sorted in alphabetical order.

anb004
  • As you get the attributes in a fixed order, it means you _can_ make a hash for structurally identical XMLs. However, it would not detect equal XMLs with moved attributes. Why not make a hash of the file itself, then? Whitespace? For XML you might try a pull parser. – Joop Eggen Jan 03 '19 at 10:26
  • @JoopEggen I actually want to get an attribute from that XML, store it, remove that attribute from the XML, and then hash it – anb004 Jan 03 '19 at 10:41
  • That means you want a hash of the document's content model, not just of a file. To do that, you need to compute the document's canonical form (whichever type suits you best) and then compute the hash of that. Canonical forms solve the problem you're describing by listing attributes sorted by name. – kumesana Jan 03 '19 at 10:58
  • Will try this and update – anb004 Jan 04 '19 at 05:08

1 Answer

Attribute order has no meaning in XML, and tools that process XML are at liberty to change the order.

When you are comparing documents for equivalence (which you appear to be doing) then you should use comparison and hash functions that do not depend on the order of attributes.
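As a rough illustration of an order-independent hash, the sketch below walks the DOM, sorts each element's attributes by name, and feeds the result into SHA-256. It also covers the step described in the comments: reading one attribute, removing it, and hashing what is left. The class name `OrderIndependentXmlHash`, its method names, and the attribute name `sig` in the usage note are all hypothetical, chosen only for this example. This is a hand-rolled walk, not W3C Canonical XML: it ignores comments and processing instructions, does no namespace normalization, and treats whitespace-only text nodes as significant.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class: a sketch, not a standard canonicalization.
public class OrderIndependentXmlHash {

    // Parse XML text into a DOM tree.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Return the named attribute of the root element and remove it from the tree.
    // Element.getAttribute returns "" when the attribute is absent.
    public static String takeRootAttribute(Document doc, String name) {
        String value = doc.getDocumentElement().getAttribute(name);
        doc.getDocumentElement().removeAttribute(name);
        return value;
    }

    // Hex SHA-256 digest of the tree that does not depend on attribute order.
    public static String hash(Document doc) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        feed(doc.getDocumentElement(), md);
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Depth-first walk; attributes are sorted by name before being digested.
    private static void feed(Node node, MessageDigest md) {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            update(md, "E:" + node.getNodeName());
            NamedNodeMap map = node.getAttributes();
            List<Attr> attrs = new ArrayList<>();
            for (int i = 0; i < map.getLength(); i++) {
                attrs.add((Attr) map.item(i));
            }
            attrs.sort(Comparator.comparing(Attr::getName));
            for (Attr a : attrs) {
                update(md, "A:" + a.getName() + "=" + a.getValue());
            }
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                feed(c, md);
            }
            update(md, "/E");
        } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            update(md, "T:" + node.getNodeValue());
        }
        // Comments and processing instructions are skipped in this sketch.
    }

    // Digest one token followed by a zero byte so adjacent tokens cannot run together.
    private static void update(MessageDigest md, String token) {
        md.update(token.getBytes(StandardCharsets.UTF_8));
        md.update((byte) 0);
    }
}
```

A possible use, assuming the attribute to strip is called `sig`: call `parse(xml)`, then `takeRootAttribute(doc, "sig")` to keep the stored value, then `hash(doc)`. Two documents that differ only in attribute order should produce the same digest this way. If interoperability with other tools matters, a real Canonical XML implementation is likely the better choice.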

Michael Kay