10

Sometimes when users cut and paste their x509 certificates into our system from the web, spaces get mixed in.

Is it safe to assume that spaces are not valid characters in x509 certificates and strip them out?

Nick

2 Answers

10

I assume you are talking about a PEM-encoded certificate, i.e. a certificate with a -----BEGIN CERTIFICATE----- header and a -----END CERTIFICATE----- footer, which looks like this:

-----BEGIN CERTIFICATE-----
MIICwzCCAaugAwIBAgIKUXEyN2GLpe8......
-----END CERTIFICATE-----

In that case the certificate content is base64-encoded. Since a certificate is a digitally signed object, you cannot change a single bit of it; otherwise signature validation fails. But whitespace characters (spaces, tabs, line feeds) are not valid base64 characters. If some whitespace has been added to the certificate string, you can probably safely remove it, since it carries no data. A robust certificate parser will probably just ignore it. Note that it is common practice to split a PEM-encoded certificate into lines of 64 characters; the certificate reader ignores the added newline characters.

The good news: after removing these additional characters, the digital signature acts as a check. If the certificate parses successfully and its signature verifies, its integrity is intact.
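As a rough illustration of the clean-up described above, here is a minimal Python sketch. The function name `normalize_pem_certificate` is hypothetical (it is not part of any library), and the sketch assumes the input contains exactly the -----BEGIN CERTIFICATE----- / -----END CERTIFICATE----- boundaries shown earlier. It only removes whitespace between the boundaries, because the boundary lines themselves contain a legitimate space. It does not verify the signature; that is still the job of a real certificate parser.

```python
import base64
import re

# Hypothetical helper for illustration only; not from any library.
_PEM_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
    re.DOTALL,
)


def normalize_pem_certificate(text: str) -> str:
    """Strip stray whitespace from the base64 body and re-wrap it at 64 columns."""
    match = _PEM_RE.search(text)
    if match is None:
        raise ValueError("no PEM certificate block found")

    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines), so joining the pieces removes all of it.
    body = "".join(match.group(1).split())

    # validate=True makes the decoder reject anything that is still outside
    # the base64 alphabet (for example a backslash) instead of dropping it.
    der = base64.b64decode(body, validate=True)

    # Re-encode and wrap into lines of 64 characters.
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return (
        "-----BEGIN CERTIFICATE-----\n"
        + "\n".join(lines)
        + "\n-----END CERTIFICATE-----\n"
    )
```

Decoding strictly after the whitespace is gone means that any other unexpected character surfaces as an error (`binascii.Error`, a subclass of `ValueError`) rather than being silently discarded.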

Jcs
  • Does this mean that the certificate content can only contain the letters A to Z, the letters a to z, the numerals (0 - 9), and the "+" and "/" symbols? Is a backslash "\" legal? – Andy J Aug 09 '18 at 06:07
  • @AndyJ A-Z a-z 0-9 + and / (plus = for padding) are the only legal chars in base64. A certificate is a binary object which is traditionally represented with a base64 encoding. If the certificate was encoded in hexadecimal the only used chars would be a-f and 0-9 – Jcs Dec 24 '20 at 10:18
5

Yes, spaces are allowed, according to RFC 7468.

First of all, traditional base64 decoders (as specified in RFC 3548 or RFC 4648) do not allow unexpected bytes (such as white space) in the octet stream. So, according to those RFCs, base64-encoded data with white space is invalid.

However, base64 encoding for MIME (RFC 2045) is explicit in section 6.8 and allows for decoding such data:

All line breaks or other characters not found in Table 1 must be ignored by decoding software.
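The difference between the two behaviours can be seen with Python's standard `base64` module, used here purely as an illustration (the sample string is made up, and nothing in this thread says which decoder a given certificate parser uses). By default `b64decode` discards characters outside the alphabet, in the spirit of RFC 2045; with `validate=True` it rejects them.

```python
import base64
import binascii

# "hello world" in base64, with whitespace sprinkled in (made-up sample data).
spaced = "aGVs bG8g\nd29y bGQ="

# Lenient mode (the default): characters outside the base64 alphabet
# are discarded before decoding.
lenient = base64.b64decode(spaced)  # b'hello world'

# Strict mode: any character outside the alphabet raises binascii.Error.
try:
    base64.b64decode(spaced, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False  # this branch is taken for the input above
```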

Unfortunately, there has never been a clear specification saying that "PEM-encoded" x509 certificates must use RFC 2045-compliant base64 encoding (see also "Where is the PEM file format specified?").

Since 2015 there has been a definitive source that answers the question asked here: RFC 7468, which specifies the textual encoding of certificates. It says:

Furthermore, parsers SHOULD ignore whitespace and other non-base64 characters

The most lax parser implementations are not line-oriented at all and will accept any mixture of whitespace outside of the encapsulation boundaries

Dr. Jan-Philip Gehrcke
  • The quoted RFC 7468 excerpt does not say spaces are allowed; it only says parsers SHOULD ignore them, which means almost the opposite: spaces are not allowed if you want them to be parsable by all compliant parsers. – Remember Monica Feb 06 '20 at 11:17