Sometimes when users cut and paste their x509 certificates into our system from the web, spaces get mixed in.
Is it safe to assume that spaces are not valid characters in x509 certificates and strip them out?
I assume you are talking about a PEM-encoded certificate, i.e. a certificate with a -----BEGIN CERTIFICATE-----
header and a -----END CERTIFICATE-----
footer, which looks like this:
-----BEGIN CERTIFICATE-----
MIICwzCCAaugAwIBAgIKUXEyN2GLpe8......
-----END CERTIFICATE-----
In that case the certificate content is base64-encoded. Since a certificate is a digitally signed object, you cannot change a single bit of it; otherwise the signature validation fails. However, whitespace characters (spaces, tabs, line feeds) are not part of the base64 alphabet, so removing them does not change the bytes that the encoding represents. If whitespace has been added to the certificate string, you can probably safely remove it. A robust certificate parser will probably just ignore it anyway. Note that it is common practice to split a PEM-encoded certificate into lines of 64 columns; the certificate reader ignores the added newline characters.
The good news: thanks to the digital signature, if the certificate parses and its signature validates after these additional characters have been removed, its integrity is intact.
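The clean-up described above could be sketched roughly as follows in Python. This is only an illustration, not a definitive implementation: the function name normalize_pem_certificate is made up for the example, and it assumes that the pasted text contains exactly one certificate and that the BEGIN/END lines themselves arrived undamaged.

```python
import base64
import re

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def normalize_pem_certificate(pasted: str) -> str:
    """Strip stray whitespace from a pasted PEM certificate and re-wrap it.

    Hypothetical helper: assumes a single certificate whose BEGIN/END
    boundary lines are intact.
    """
    start = pasted.find(BEGIN)
    end = pasted.find(END, start + len(BEGIN)) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("missing PEM encapsulation boundaries")

    # Everything between the boundaries is the base64 body.
    body = pasted[start + len(BEGIN):end]

    # Remove spaces, tabs, CR and LF; none of them carry data in base64.
    body = re.sub(r"\s+", "", body)

    # Strict check: raises binascii.Error (a ValueError subclass) if
    # anything other than base64 characters is left, or padding is wrong.
    base64.b64decode(body, validate=True)

    # Re-wrap at the customary 64 columns.
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    return "\n".join([BEGIN, *lines, END]) + "\n"
```

The output can then be handed to whatever certificate parser you use. Stripping whitespace cannot alter the decoded bytes, so the signature check in that parser remains the real integrity test.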
Yes, spaces are allowed, according to RFC 7468.
First of all, traditional base64 decoders (as specified in RFC 3548 or RFC 4648) do not allow unexpected bytes (such as white space) in the octet stream. So, according to those RFCs, base64-encoded data with white space is invalid.
However, base64 encoding for MIME (RFC 2045) is explicit in section 6.8 and allows for decoding such data:
All line breaks or other characters not found in Table 1 must be ignored by decoding software.
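To make the difference between the two decoding styles concrete, here is a small Python sketch. Python's standard base64.b64decode discards characters outside the alphabet by default, which is similar in spirit to the RFC 2045 rule, and rejects them when called with validate=True. The helper names decode_lenient and decode_strict and the sample payload are invented for the example.

```python
import base64
import binascii


def decode_lenient(text: str) -> bytes:
    # Default mode: characters outside the base64 alphabet are discarded
    # before decoding, similar in spirit to the RFC 2045 rule.
    return base64.b64decode(text)


def decode_strict(text: str) -> bytes:
    # validate=True: any character outside the alphabet raises binascii.Error.
    return base64.b64decode(text, validate=True)


payload = b"hello x509"
encoded = base64.b64encode(payload).decode("ascii")

# Simulate a careless paste: a space after every four characters.
spaced = " ".join(encoded[i:i + 4] for i in range(0, len(encoded), 4))

# The lenient decoder recovers the original bytes despite the spaces.
print(decode_lenient(spaced) == payload)

try:
    decode_strict(spaced)
except binascii.Error as exc:
    print("strict decoder rejected the input:", exc)
```

Which behaviour you get from a given certificate library depends on which of these two styles its base64 decoder follows.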
Unfortunately, there has never been a clear specification saying that "PEM-encoded" x509 certificates must use RFC 2045-compliant base64 encoding (also see Where is the PEM file format specified?)
Since 2015 there has been a definitive source that answers the question asked here: RFC 7468, which specifies the textual encoding of certificates. It says:
Furthermore, parsers SHOULD ignore whitespace and other non-base64 characters
The most lax parser implementations are not line-oriented at all and will accept any mixture of whitespace outside of the encapsulation boundaries
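A lax parser in the sense of the RFC 7468 passages quoted above might look something like the following Python sketch. It is not line-oriented: it searches the whole input for certificate blocks, throws away every non-base64 character between the boundaries, and returns the decoded DER bytes. The name lax_pem_to_der is hypothetical, and handling only the CERTIFICATE label is a simplification made for the example.

```python
import base64
import re
from typing import List

# Non-greedy match of everything between one pair of boundaries;
# DOTALL lets "." span line breaks, so line structure is irrelevant.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
    re.DOTALL,
)

# Anything that is not a base64 alphabet character or padding.
_NON_BASE64 = re.compile(r"[^A-Za-z0-9+/=]")


def lax_pem_to_der(text: str) -> List[bytes]:
    """Return the DER bytes of every certificate block found in text.

    Hypothetical helper: whitespace and other non-base64 characters
    inside a block are ignored; text outside the blocks is skipped.
    """
    der_blobs = []
    for match in _PEM_BLOCK.finditer(text):
        body = _NON_BASE64.sub("", match.group(1))
        # After filtering, a strict decode only fails on bad padding.
        der_blobs.append(base64.b64decode(body, validate=True))
    return der_blobs
```

Being this permissive is convenient for pasted input, but it also means the function accepts text that a strict parser would refuse, so the decoded bytes should still go through normal certificate parsing and signature validation.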