Manual verification of XML Signature

Question

I can successfully perform manual reference validation (canonicalize each referenced element --> SHA-1 --> Base64 --> compare with the DigestValue content), but I cannot verify the SignatureValue. Here is the SignedInfo to canonicalize and hash:

<ds:SignedInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
 <ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"></ds:CanonicalizationMethod>
 <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></ds:SignatureMethod>
 <ds:Reference URI="#element-1-1291739860070-11803898">
  <ds:Transforms>
   <ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"></ds:Transform>
  </ds:Transforms>
  <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></ds:DigestMethod>
  <ds:DigestValue>d2cIarD4atw3HFADamfO9YTKkKs=</ds:DigestValue>
 </ds:Reference>
 <ds:Reference URI="#timestamp">
  <ds:Transforms>
   <ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"></ds:Transform>
  </ds:Transforms>
  <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></ds:DigestMethod>
  <ds:DigestValue>YR/fZlwJdw+KbyP24UYiyDv8/Dc=</ds:DigestValue>
 </ds:Reference>
</ds:SignedInfo>

After removing all the spaces between tags (so that the whole element is on a single line), I obtain this SHA-1 digest (in Base64):

6l26iBH7il/yrCQW6eEfv/VqAVo=

Now I expect to find the same digest after decrypting the SignatureValue content, but I get a different and longer value:

MCEwCQYFKw4DAhoFAAQU3M24VwKG02yUu6jlEH+u6R4N8Ig=

Here's some Java code for the decryption:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = dbf.newDocumentBuilder();
    Document doc = builder.parse(new File(inputFilePath));
    NodeList nl = doc.getElementsByTagName("ds:SignatureValue");
    if (nl.getLength() == 0) {
        throw new Exception("Cannot find SignatureValue element");
    }
    String signature = "OZg96GMrGh0cEwbpHwv3KDhFtFcnzPxbwp9Xv0pgw8Mr9+NIjRlg/G1OyIZ3SdcOYqqzF4/TVLDi5VclwnjBAFl3SEdkyUbbjXVAGkSsxPQcC4un9UYcecESETlAgV8UrHV3zTrjAWQvDg/YBKveoH90FIhfAthslqeFu3h9U20=";
    X509Certificate cert = X509Certificate.getInstance(new FileInputStream(<a file path>));
    PublicKey pubkey = cert.getPublicKey();
    Cipher cipher = Cipher.getInstance("RSA","SunJCE");
    cipher.init(Cipher.DECRYPT_MODE, pubkey);
    byte[] decodedSignature = Base64Coder.decode(signature);
    cipher.update(decodedSignature);
    byte[] sha1 = cipher.doFinal();

    System.out.println(Base64Coder.encode(sha1));

What confuses me most is that the two digests have different sizes, but of course I also need to obtain exactly the same value from the two calculations. Any suggestions? Thank you.

"Ater removing all the spaces between tags"... is that right? Looking at http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Example-WhitespaceInContent it sounds like maybe you're removing too much whitespace. — Laurence Gonsalves, Dec 11 '10 at 22:13
Thanks for the answer. I understand your point but, as I said at the beginning of the question, I can successfully do reference validation (two references, it cannot be an accident) of the same SOAP message by, and only by, removing those spaces, so I have to assume it's right. — Johnca, Dec 11 '10 at 22:48
Canonicalization is much more than removing whitespace. It handles namespace prefixes, attribute ordering, and generally everything that changes the bytes of the physical file (and thus the hash digest) but does not change the XML infoset (i.e. the meaningful payload of the XML). — m0sa, Dec 11 '10 at 22:54
Of course. I use org.apache.xml.security.c14n.Canonicalizer, but afterwards I still need to remove spaces in order to obtain the same digests as those in the XML. — Johnca, Dec 11 '10 at 23:01
OK, I found out why the decrypted value is so large: it is made up of a prefix plus the actual digest (http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/Overview.html). The actual digest is in the last 20 bytes. Now I have two equally sized digests, but they still differ from each other :( — Johnca, Dec 11 '10 at 23:22

score 8 · Accepted Answer · answered Dec 13 '10 at 16:25

MCEwCQYFKw4DAhoFAAQU3M24VwKG02yUu6jlEH+u6R4N8Ig= is Base64 encoding for a DER-encoded ASN.1 structure: a SEQUENCE containing first an AlgorithmIdentifier (which states that this is SHA-1, with no parameters since SHA-1 accepts none), then an OCTET STRING which contains the actual 20-byte value. In hexadecimal, the value is: dccdb8570286d36c94bba8e5107faee91e0df088.
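
To make the structure concrete, here is a minimal sketch that checks the fixed 15-byte DER prefix of a SHA-1 DigestInfo and prints the 20 bytes that follow it. The class and method names are made up for illustration, and the code uses java.util.Base64 (Java 8+) rather than the Base64Coder class from the question.

```java
import java.util.Arrays;
import java.util.Base64;

public class DigestInfoDemo {
    // DER header of a PKCS#1 v1.5 DigestInfo for SHA-1:
    // SEQUENCE { SEQUENCE { OID 1.3.14.3.2.26, NULL }, OCTET STRING of 20 bytes }
    static final byte[] SHA1_PREFIX = {
        0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2b, 0x0e,
        0x03, 0x02, 0x1a, 0x05, 0x00, 0x04, 0x14
    };

    // Hypothetical helper: returns the raw SHA-1 digest as lowercase hex.
    static String sha1DigestHex(String base64DigestInfo) {
        byte[] der = Base64.getDecoder().decode(base64DigestInfo);
        boolean prefixOk = der.length == SHA1_PREFIX.length + 20
                && Arrays.equals(Arrays.copyOfRange(der, 0, SHA1_PREFIX.length), SHA1_PREFIX);
        if (!prefixOk) {
            throw new IllegalArgumentException("not a SHA-1 DigestInfo");
        }
        StringBuilder hex = new StringBuilder();
        for (int i = SHA1_PREFIX.length; i < der.length; i++) {
            hex.append(String.format("%02x", der[i] & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Prints dccdb8570286d36c94bba8e5107faee91e0df088
        System.out.println(sha1DigestHex("MCEwCQYFKw4DAhoFAAQU3M24VwKG02yUu6jlEH+u6R4N8Ig="));
    }
}
```

This only pattern-matches the one fixed prefix; a general ASN.1 parser is not needed for this check.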

This ASN.1 structure is part of the standard RSA signature mechanism. You are using RSA decryption to access that structure, which is non-standard. You are actually lucky to get anything at all, since RSA encryption and RSA signature are two distinct algorithms. It so happens that they both feed on the same kind of key pairs, and that the "old-style" (aka "PKCS#1 v1.5") signature and encryption schemes use similar padding techniques (similar but not identical; it is already a bit surprising that the Java implementation of RSA did not choke on the signature padding when used in decryption mode).
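
For comparison, the standard route in Java is java.security.Signature with the "SHA1withRSA" algorithm, which hashes the data, rebuilds the padded ASN.1 structure and compares it for you. The sketch below is self-contained and therefore signs a stand-in byte string with a freshly generated key pair; in your case the public key would come from the certificate, the data would be the canonicalized SignedInfo bytes, and the signature would be the Base64-decoded SignatureValue. The class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

public class RsaSha1VerifyDemo {
    // Verifies an RSA PKCS#1 v1.5 signature computed over SHA-1(data).
    static boolean verify(PublicKey key, byte[] data, byte[] signatureValue)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(key);
        verifier.update(data);
        return verifier.verify(signatureValue);
    }

    // Signs stand-in data, then verifies it untouched and after flipping one bit.
    static boolean[] demo() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            KeyPair pair = gen.generateKeyPair();
            byte[] data = "stand-in for canonical SignedInfo".getBytes(StandardCharsets.UTF_8);

            Signature signer = Signature.getInstance("SHA1withRSA");
            signer.initSign(pair.getPrivate());
            signer.update(data);
            byte[] sig = signer.sign();

            boolean intact = verify(pair.getPublic(), data, sig);
            data[0] ^= 1; // any change to the signed bytes must break verification
            boolean tampered = verify(pair.getPublic(), data, sig);
            return new boolean[] { intact, tampered };
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("intact: " + r[0] + ", tampered: " + r[1]);
    }
}
```

With this API you never see the intermediate digest, so it tells you whether the bytes match but not which bytes were originally signed.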

Anyway, 6l26iBH7il/yrCQW6eEfv/VqAVo= is Base64 encoding for a 20-byte value, which, in hexadecimal, is: ea5dba8811fb8a5ff2ac2416e9e11fbff56a015a. This is what you get by hashing the XML structure you show above, after having removed all whitespace between tags. Removing all whitespace is not proper canonicalization. Actually, as far as I know, whitespace is affected only between attributes, within the tags, but external whitespace must be kept unchanged (except for line ending normalization [the LF / CR+LF thing]).

The value which was used for the signature generation (the dccdb85...) can be obtained by using the XML object you show and by removing the leading spaces. To be clear: you copy+paste the XML into a file, then remove the leading spaces (0 to 3 spaces) on each line. You make sure that all end-of-lines use a single LF (0x0A byte) and you remove the final LF (the one just after </ds:SignedInfo>). The resulting file must have length 930 bytes, and its SHA-1 hash is the expected dccdb85... value.
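
The manual steps above can be sketched in code as follows. This is only an illustration of the text clean-up I describe (LF line endings, no leading spaces, no final LF) followed by a SHA-1 hash; it is not a canonicalizer, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SignedInfoDigestDemo {
    // Undo the pretty-printing: normalize CR+LF to LF, drop leading spaces
    // on every line, and drop one trailing LF if present.
    static String stripIndentation(String text) {
        String lf = text.replace("\r\n", "\n");
        String noIndent = lf.replaceAll("(?m)^ +", "");
        return noIndent.endsWith("\n")
                ? noIndent.substring(0, noIndent.length() - 1)
                : noIndent;
    }

    // SHA-1 of the UTF-8 bytes of the text, as lowercase hex.
    static String sha1Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Tiny stand-in document; paste the SignedInfo text here to try it on the real input.
        String indented = "<a>\r\n  <b></b>\r\n</a>\n";
        String cleaned = stripIndentation(indented);
        System.out.println(cleaned.length() + " chars, SHA-1 " + sha1Hex(cleaned));
    }
}
```

Checking the byte length of the cleaned text first (930 for the SignedInfo shown in the question) is a quick way to spot a stray space or line ending before comparing hashes.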

Thank you for the clear answer, it solved my problem! Where does the "remove leading spaces" idea come from? I also wonder why org.apache.xml.security.c14n.Canonicalizer doesn't remove them, and why I need two different XML representations for the two tasks: for reference validation I have to remove all the spaces between tags, whereas for signature validation I have to do what you wrote. This XML message was sent from a servlet deployed on JBoss and received by a Web Service deployed on JBoss. Maybe JBoss is doing something non-standard? — Johnca, Dec 14 '10 at 11:06
I saw the "remove leading spaces" idea on a web page which tries to explain canonicalization (I don't remember which). Canonicalization keeps text content unchanged, including spaces (the Apache canonicalizer is correct in not removing them). Since spaces are troublesome, it is _recommended_ to initially build the XML without any indentation (i.e. no leading spaces). My guess is that the original XML did not have the leading spaces, and that they were added at some point (_after_ signature computation) as a "reading helper" (which breaks the signature). — Thomas Pornin, Dec 14 '10 at 11:43

score 0 · Answer 2 · answered Jul 07 '12 at 02:45

Looking at your particular XML token, I can tell you a few things.

You are using the Canonicalization method Exclusive XML Canonicalization Version 1.0. This is a very IMPORTANT factor in ensuring that you produce the right digest values and signature.
You are using the same Canonicalization method both for computing the reference digests, and for canonicalizing the SignedInfo before producing the signature.

The specification for Exclusive XML Canonicalization Version 1.0 is published by the W3C as a W3C Recommendation. If you are computing your values manually, be sure to conform exactly to the specification: canonicalization is hard to get right, and any deviation from it will make your digest and signature values incorrect.

I just wrote an extensive article describing the XML Signature validation process. The article is located at my blog. It describes the process in much more detail than my answer, as there are many intricacies to XML Signature. It also contains links to prevalent specifications and RFCs.
