Extract PKCS1 from Signed PDF

Question

I have to extract signature fields from a signed PDF document to create a printed version of the signature. So far I've been able to recover the signer certificate, reason, signing date and other fields using this iText code:

PdfReader reader = new PdfReader(signedPdf);
AcroFields af = reader.getAcroFields();
ArrayList<String> names = af.getSignatureNames();

SimpleDateFormat sdf = new SimpleDateFormat(
                    "dd/MM/yyyy 'a las' HH:mm:ss");

for (int i = 0; i < names.size(); ++i) {

    StringBuilder sb = new StringBuilder();
    String name = names.get(i);
    PdfPKCS7 pk = af.verifySignature(name);

    String firmante = CertificateInfo.getSubjectFields(
            pk.getSigningCertificate()).getField("CN");
    sb.append("Nombre del firmante: " + firmante + "\n");

    Date signDate = pk.getSignDate().getTime();
    String sdate = sdf.format(signDate);
    sb.append("Fecha y hora de la firma: " + sdate + "\n");


    String razon = pk.getReason();
    sb.append("proposito: " + razon + "\n");
}

As far as I know, the PDF signature was created with the iText PdfPKCS7 class, using the setExternalDigest method to add a PKCS1 byte array created in an external application. The file looks correctly signed and is validated by external tools.

// Create the signature
PdfPKCS7 sgn = new PdfPKCS7(null, chain, "SHA1", "BC", null, false);

//pkcs1Bytes is a byte array with the signed document hash
sgn.setExternalDigest(pkcs1Bytes, null, "RSA");

However, one of the required fields for the printed version is a "signature digital stamp", which is a Base64 string of the signed document hash, i.e. the PKCS1 bytes.

Is it possible to extract those PKCS1 bytes from the signed PDF document?

EDITED: I forgot to mention that when I call the PdfPKCS7.getEncodedPKCS1() method after verifying the signature, it throws ExceptionConverter: java.security.SignatureException: object not initialized for signing

Why don't you directly use `PdfPKCS7.getEncodedPKCS1()` after verifying the signature? — pedrofb, Jul 28 '16 at 07:00
@pedrofb Sorry, I forgot to mention that when I use that method after verifying it throws `ExceptionConverter: java.security.SignatureException: object not initialized for signing` — Hugo Hernandez, Jul 28 '16 at 07:37
OK, I have just reviewed the javadoc. Looking at the source code, that method performs the digital signature, so it is not suitable for your needs. The `messageDigest` is inside the class and is used in the verify() method, but from a quick look at the code I have not seen how to extract it. — pedrofb, Jul 28 '16 at 08:08
@pedrofb Thanks, it's good to know. That changes my perspective of the problem. — Hugo Hernandez, Jul 28 '16 at 08:20
How about simply extracting the CMS signature container using iText and then analyzing that container using BouncyCastle, a version of which you already have linked in? — mkl, Jul 28 '16 at 13:24
@mkl The method PdfPKCS7.getEncodedPKCS7() also fails with `ExceptionConverter: java.security.SignatureException: object not initialized for signing`. I haven't found any code example for doing that. :( — Hugo Hernandez, Jul 29 '16 at 04:48
Just to be sure: Which digest value exactly do you want? In case of CMS signatures usually multiple hashes are involved, cf. [this answer](http://stackoverflow.com/a/28429984/1729265). Do you want the digest value of the signed document stream or do you want the digest value of the signed attributes? — mkl, Jul 29 '16 at 08:02
@mkl According to what they told me, the "signature stamp" is the Base64 string of the PKCS1 bytes. That is (in this implementation), the signed hash of the target document, which is assigned later in the setExternalDigest method. — Hugo Hernandez, Jul 29 '16 at 23:28
That sounds like the signature value. Which is the signed hash. But this hash is not immediately the hash of the document but instead the hash of the signed attributes, and the document hash is the value of one of those attributes. — mkl, Jul 30 '16 at 07:03

score 3 · Accepted Answer · edited May 23 '17 at 12:08

I have reviewed the code and it seems the PdfPKCS7 class does not provide access to the digest. However, the content is stored in the private member PdfPKCS7.digest, so reflection will allow you to extract it. I have found a similar example here and here (they are basically the same).

PdfPKCS7 pdfPkcs7 = acroFields.verifySignature(name);
pdfPkcs7.verify();

Field digestField = PdfPKCS7.class.getDeclaredField("digest");
digestField.setAccessible(true);
byte[] digest = (byte[]) digestField.get(pdfPkcs7);

I think the variable you need is digest, because its value is assigned in getEncodedPKCS1 when performing the signature:

public byte[] getEncodedPKCS1() {
    try {
        if (externalDigest != null)
            digest = externalDigest;
        else
            digest = sig.sign();

        //skipped content

It is then used in verify() in the following way: verifyResult = sig.verify(digest);

Note that digest is a private field, so its name or content may differ between versions. Review the code of your specific version.
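Once the bytes are available, the "signature digital stamp" from the question should just be their Base64 form. Here is a minimal sketch using only the JDK (Java 8+). The class name SignatureStamp and the method toStamp are hypothetical, and the sample bytes are a placeholder for the value read from the digest field:

```java
import java.util.Base64;

class SignatureStamp {

    // Hypothetical helper: encodes raw signature bytes as a single-line Base64 string.
    static String toStamp(byte[] pkcs1Bytes) {
        if (pkcs1Bytes == null) {
            throw new IllegalArgumentException("no signature bytes available");
        }
        return Base64.getEncoder().encodeToString(pkcs1Bytes);
    }

    public static void main(String[] args) {
        // Placeholder data; in the real code this would be the reflected digest array.
        byte[] sample = {0x01, 0x02, 0x03, (byte) 0xFF};
        System.out.println(toStamp(sample)); // prints AQID/w==
    }
}
```

The basic encoder is used here because it produces no line breaks, which is usually what a printed stamp needs. If the consumer expects wrapped lines, Base64.getMimeEncoder() would be the variant to try.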

This works for me! It's the code that adds the least complexity, and simpler is always better, thank you. — Hugo Hernandez, Aug 01 '16 at 05:02
While this solution indeed does not look complicated, reflection may prove to be an issue in some frameworks (where using reflection to access private information may be forbidden). In the end, therefore, it depends on the details of the case at hand. — mkl, Aug 01 '16 at 12:24

score 3 · Answer 2 · edited May 23 '17 at 12:32

Considering your code I assume you are using a 5.x iText version, not a 7.x.

You can either use reflection (cf. this older answer or pedrofb's answer here) or you can simply extract the CMS signature container using iText and then analyze that container using BouncyCastle; a version of BC is usually already present anyway if you use signature-related functionality of iText.

As the OP has already observed, PdfPKCS7.getEncodedPKCS7() fails with "ExceptionConverter: java.security.SignatureException: object not initialized for signing". The reason is that this method is meant for retrieving a signature container newly constructed by the PdfPKCS7 instance.

To extract the CMS signature container using iText you can use this code instead:

AcroFields fields = reader.getAcroFields();
PdfDictionary sigDict = fields.getSignatureDictionary(name); 
PdfString contents = sigDict.getAsString(PdfName.CONTENTS);
byte[] contentBytes = contents.getOriginalBytes();

contentBytes now contains the encoded CMS container (plus some trailing bytes, usually null bytes, as the Contents value usually is larger than required for the signature container).
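Some parsers are strict about trailing bytes, so it can help to cut the array down to the container itself. The following is a minimal sketch, assuming the container is DER encoded with a definite length and a single-byte tag (a CMS ContentInfo starts with the SEQUENCE tag 0x30). The class CmsContainerTrimmer and its method are hypothetical names, and containers using BER indefinite-length encoding are rejected rather than handled:

```java
import java.util.Arrays;

class CmsContainerTrimmer {

    // Returns only the DER object at the start of 'padded', dropping any trailing padding.
    static byte[] trimToDerLength(byte[] padded) {
        if (padded == null || padded.length < 2) {
            throw new IllegalArgumentException("too short for a DER header");
        }
        int firstLengthByte = padded[1] & 0xFF;
        int headerLength = 2;      // one tag byte plus the first length byte
        long contentLength;
        if (firstLengthByte < 0x80) {
            // Short form: the byte is the content length itself.
            contentLength = firstLengthByte;
        } else if (firstLengthByte == 0x80) {
            throw new IllegalArgumentException("indefinite length (BER) is not handled by this sketch");
        } else {
            // Long form: the low 7 bits give the number of length bytes that follow.
            int lengthBytes = firstLengthByte & 0x7F;
            if (lengthBytes > 4 || padded.length < 2 + lengthBytes) {
                throw new IllegalArgumentException("unsupported or truncated length field");
            }
            contentLength = 0;
            for (int i = 0; i < lengthBytes; i++) {
                contentLength = (contentLength << 8) | (padded[2 + i] & 0xFF);
            }
            headerLength += lengthBytes;
        }
        long total = headerLength + contentLength;
        if (total > padded.length) {
            throw new IllegalArgumentException("declared length exceeds available bytes");
        }
        return Arrays.copyOf(padded, (int) total);
    }

    public static void main(String[] args) {
        // Tiny stand-in for a padded Contents value: a 5-byte DER object plus two null bytes.
        byte[] padded = {0x30, 0x03, 0x01, 0x02, 0x03, 0x00, 0x00};
        System.out.println(trimToDerLength(padded).length); // prints 5
    }
}
```

Reading the declared length is safer than stripping trailing zero bytes, because the container itself may legitimately end with a zero byte.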

Analyzing this container using BouncyCastle is not difficult but the details may depend on the exact BouncyCastle version you use.
