
I'm decoding an X.509 certificate in ASN.1 format. Decoding works and I can traverse the structure, but there is one thing that I don't understand.

There are some scenarios where I get an octet string, and the website I am playing with (http://lapo.it/asn1js/) shows that these octet strings actually contain more of the ASN.1 tree. The website annotates such octet strings with "(encapsulates)".

My question is this: how do I know during parsing that an octet string actually encapsulates something more? Do I just try to parse it, looking if I get a tag and valid length? If not then it is pure bytes data? And if yes then it is a valid sub-tree?

Or is this meant to be output as bytes, so that the consumer should only try to parse it if they know that it holds encoded data for certain keys?

Take the example that is already loaded on the site and hit "decode". I am referring for example to offset 332 which is an octet string that encapsulates a bit string.

Cocoanetics
  • Offset 332 in that example is the *keyUsage*, which is [just a bitstring](http://www.alvestrand.no/objectid/2.5.29.15.html). A more interesting example might be `subjectAltName` 2.5.29.17, not present in that cert, but https://google.com/ has a good one to test parsing with. – mr.spuratic Mar 08 '13 at 17:12
  • Sorry, just to be clear I mean the certificate on the google.com https site, it has 40+ altNames. This is *not* a LMGTFY ;-) – mr.spuratic Mar 08 '13 at 18:50
  • @mr.spuratic Offset 332 in that example has an octet string that contains a bitstring. My question is whether there is some sort of rule that lets me deduce that a primitive encapsulates something else. Or would I just try to decode it and, if the lengths and types all work, take the decoded values instead? Or would I know from some spec that things like subjectAltName are always an octet string that encodes something more? – Cocoanetics Mar 09 '13 at 11:27

2 Answers


This is what "extensions" looks like in ASN.1 speak (RFC 2459 §B.2 — I know that RFC is "obsolete", but that useful appendix isn't present in the later versions).

Extensions ::= SEQUENCE OF Extension

Extension ::= SEQUENCE {
    extnID     OBJECT IDENTIFIER,
    critical   BOOLEAN DEFAULT FALSE,
    extnValue  OCTET STRING }

Every extension payload is encapsulated within an OCTET STRING. The OID of the extension tells you what to expect within that octet string. In the case of keyUsage it's a BIT STRING (§4.2.1.3).
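To make the OID-driven decoding concrete, here is a minimal Python sketch, not a complete DER parser. It assumes single-byte tags and a hand-built keyUsage extension as sample input; the helper names (`read_tlv`, `decode_oid`, `DECODERS`, `decode_extension`) are made up for illustration.

```python
def read_tlv(data, offset=0):
    """Read one DER TLV; return (tag, content_bytes, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[offset:offset + n], "big")
        offset += n
    return tag, data[offset:offset + length], offset + length


def decode_oid(body):
    """Turn the content bytes of an OBJECT IDENTIFIER into dotted notation."""
    arcs, value = [], 0
    for byte in body:
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear marks the last byte of an arc
            arcs.append(value)
            value = 0
    first = arcs[0]
    head = [first // 40, first % 40] if first < 80 else [2, first - 80]
    return ".".join(str(a) for a in head + arcs[1:])


KEY_USAGE_BITS = ["digitalSignature", "nonRepudiation", "keyEncipherment",
                  "dataEncipherment", "keyAgreement", "keyCertSign",
                  "cRLSign", "encipherOnly", "decipherOnly"]


def decode_key_usage(payload):
    """The octet string content of keyUsage is itself a DER BIT STRING."""
    tag, bits, _ = read_tlv(payload)
    if tag != 0x03:
        raise ValueError("keyUsage payload is not a BIT STRING")
    data = bits[1:]  # bits[0] is the count of unused trailing bits
    return [name for i, name in enumerate(KEY_USAGE_BITS)
            if i // 8 < len(data) and data[i // 8] & (0x80 >> (i % 8))]


# Hypothetical registry: OID -> decoder for the encapsulated payload.
DECODERS = {"2.5.29.15": decode_key_usage}


def decode_extension(der):
    """Parse one Extension; decode extnValue only if the OID is known."""
    tag, body, _ = read_tlv(der)
    if tag != 0x30:
        raise ValueError("Extension must be a SEQUENCE")
    _, oid_bytes, pos = read_tlv(body)
    oid = decode_oid(oid_bytes)
    tag, value, pos = read_tlv(body, pos)
    critical = False
    if tag == 0x01:  # optional BOOLEAN, omitted in DER when FALSE
        critical = value != b"\x00"
        tag, value, pos = read_tlv(body, pos)
    if tag != 0x04:
        raise ValueError("extnValue must be an OCTET STRING")
    decoder = DECODERS.get(oid)
    return oid, critical, decoder(value) if decoder else value


# Hand-built sample: keyUsage, critical, digitalSignature + keyEncipherment.
sample = bytes.fromhex("300e0603551d0f0101ff0404030205a0")
print(decode_extension(sample))
```

The parser never guesses: an unknown OID simply yields the raw payload bytes, and only a registered OID triggers a second decoding pass over the octet string content.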

And now I have an answer to my own question on subjectAltName: it's in §4.2.1.7.

One benefit of using OCTET STRING for the content is that, as per spec, unknown (non-critical) extensions can be identified as such and trivially be skipped over (though I think DER makes it trivial too).
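As a rough illustration of that skipping behaviour, the Python sketch below walks an `Extensions` sequence and never looks inside an `extnValue` it does not recognise. The names `read_tlv`, `KNOWN` and `walk_extensions` are hypothetical, single-byte tags are assumed, and OID 1.2.3.4 is a placeholder for an unknown extension.

```python
def read_tlv(data, offset=0):
    """Read one DER TLV; return (tag, content_bytes, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[offset:offset + n], "big")
        offset += n
    return tag, data[offset:offset + length], offset + length


# Extensions this sketch "understands", keyed by raw OID content bytes.
KNOWN = {bytes.fromhex("551d0f"): "keyUsage"}  # 2.5.29.15


def walk_extensions(der):
    """Return (handled, skipped); raise on an unknown critical extension."""
    tag, body, _ = read_tlv(der)
    if tag != 0x30:
        raise ValueError("Extensions must be a SEQUENCE")
    handled, skipped = [], []
    pos = 0
    while pos < len(body):
        _, ext, pos = read_tlv(body, pos)   # one Extension SEQUENCE
        _, oid, p = read_tlv(ext)           # extnID
        tag, value, p = read_tlv(ext, p)    # critical or extnValue
        critical = False
        if tag == 0x01:
            critical = value != b"\x00"
            tag, value, p = read_tlv(ext, p)
        if oid in KNOWN:
            handled.append((KNOWN[oid], value))
        elif critical:
            raise ValueError("unrecognized critical extension " + oid.hex())
        else:
            skipped.append(oid.hex())       # payload is never inspected
    return handled, skipped


# keyUsage (critical) followed by an unknown, non-critical extension 1.2.3.4
sample = bytes.fromhex(
    "301b"
    "300e0603551d0f0101ff0404030205a0"
    "300906032a03040402dead"
)
print(walk_extensions(sample))
```

Because the payload is opaque bytes inside the octet string, the unknown extension is skipped without any knowledge of its inner syntax.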

mr.spuratic
  • Best explanation I have gotten to an ASN.1 question in a long time. ;-) – Cocoanetics Mar 10 '13 at 09:08
  • When I decode OCTET STRING; if no DEFN is available (OID) then I test if it contains a VALID ASN1 sequence and dual decode as a BLOB and as a CONTAINED sequence if valid. – smallscript Mar 24 '21 at 02:01
  • Note that in the case of `critical`, DER encoding has special rules about not including DEFAULT values (they are redundant). So you may not see the BOOLEAN present in the ASN.1, depending on which schema was used for the extension. – smallscript Mar 24 '21 at 02:03

And the way to tell ASN.1 tools to deal with that encapsulation is by using the keyword "CONTAINING". For example (this is not the actual/correct certificate spec, but it should give you an idea):

TstCert DEFINITIONS IMPLICIT TAGS ::=
BEGIN
   Sun ::= SEQUENCE {
       subjAltType OBJECT IDENTIFIER,
       name GenNames
   }

   GenNames ::= SEQUENCE SIZE (1..5) OF GenName

   GenName ::= CHOICE {
       otherName   [0] OtherName,
       rfc822Name  [1] UTF8String
   }

   OtherName ::= OCTET STRING (CONTAINING SEQUENCE {
       type-id OBJECT IDENTIFIER,
       value [0] EXPLICIT UTF8String
   } )
END
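
On the wire, such a CONTAINING constraint adds nothing: the content bytes of the octet string are simply the complete encoding of the inner type. A small Python sketch of that, assuming DER, a placeholder OID 1.2.3.4, and the universal OCTET STRING tag (inside the `GenName` CHOICE above, the implicit `[0]` tag would replace it); `encode_tlv` is a hypothetical helper:

```python
def encode_tlv(tag, content):
    """Encode one DER TLV with a single-byte tag."""
    length = len(content)
    if length < 0x80:
        header = bytes([tag, length])       # short-form length
    else:
        size = length.to_bytes((length.bit_length() + 7) // 8, "big")
        header = bytes([tag, 0x80 | len(size)]) + size
    return header + content


# Inner SEQUENCE { type-id OBJECT IDENTIFIER, value [0] EXPLICIT UTF8String }
type_id = encode_tlv(0x06, bytes.fromhex("2a0304"))    # placeholder OID 1.2.3.4
text = encode_tlv(0x0C, "hello".encode("utf-8"))       # UTF8String
value = encode_tlv(0xA0, text)                         # [0] EXPLICIT wrapper
inner = encode_tlv(0x30, type_id + value)              # SEQUENCE

# The outer OCTET STRING just carries the inner encoding as its content.
other_name = encode_tlv(0x04, inner)
print(other_name.hex())
```

A decoder that knows the schema can hand `other_name[2:]` straight back to its own parser, which is what a tool does when it sees CONTAINING.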
Mouse