What characters are allowed in an OAuth2 access token?

Question

RFC6749 and RFC6750 seem to disagree with one another about what characters are allowed in an OAuth2 Access Token.

Section A.12 of RFC6749 (the original OAuth2 spec) defines the access token format as follows:

A.12. "access_token" Syntax

The "access_token" element is defined in Sections 4.2.2 and 5.1:

access-token = 1*VSCHAR

In ABNF format, VSCHAR means:

VSCHAR = %x20-7E

(This is basically all printable ASCII characters)

However, in RFC6750 (which deals with the usage of OAuth2 bearer tokens) Section 2.1 seems to set out a stricter subset of allowed characters for access tokens.

The syntax for Bearer credentials is as follows:

b64token    = 1*( ALPHA / DIGIT /
                   "-" / "." / "_" / "~" / "+" / "/" ) *"="

credentials = "Bearer" 1*SP b64token

So that's a more restrictive set of characters, including only alphanumeric, six special characters, and trailing = for padding.
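To make the difference between the two grammars concrete, here is a minimal Python sketch that translates each ABNF rule into a regular expression and checks a few sample strings against both. The function names and the sample tokens are illustrative assumptions (the first sample is the example token used in RFC 6750); this is a reading of the two ABNF rules quoted above, not code taken from either spec.

```python
import re

# RFC 6749, A.12: access-token = 1*VSCHAR, with VSCHAR = %x20-7E
VSCHAR_TOKEN = re.compile(r"[\x20-\x7E]+")

# RFC 6750, 2.1: b64token = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" ) *"="
B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")


def is_rfc6749_access_token(token: str) -> bool:
    """True if the whole string is one or more printable ASCII characters."""
    return VSCHAR_TOKEN.fullmatch(token) is not None


def is_rfc6750_b64token(token: str) -> bool:
    """True if the whole string fits the stricter b64token syntax."""
    return B64TOKEN.fullmatch(token) is not None


if __name__ == "__main__":
    # Hypothetical sample tokens, chosen only to exercise both grammars.
    for sample in ["mF_9.B5f-4.1JqM", "abc&%def", "dGVzdA=="]:
        print(sample, is_rfc6749_access_token(sample), is_rfc6750_b64token(sample))
```

Under this reading, every b64token is also a valid RFC 6749 access token, but a string such as `abc&%def` satisfies only the wider rule.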

My questions are:

1. Which of these documents is controlling? Does RFC6750 take precedence because it's more restrictive?
2. In terms of actual implementations "in the wild", are access tokens always limited to the RFC6750 charset?
3. Bonus question: Does anyone know why these two specs published the same month on such closely related topics disagree on the access token format?

From my understanding, there is no contradiction between the two specifications. RFC6749 defines the framework and the general rules; other specifications may restrict those rules for technical or security reasons. This is what RFC6750 does. Anyway, this is a nice question, so an upvote for it. — Spomky-Labs, Apr 26 '18 at 09:11
I guess an extremely literal reading of these specs might say that an Authorization Server could legally issue access tokens including `&%` under RFC6749, but that according to RFC6750 no one could _use_ such tokens in an HTTP header. — bjmc, Apr 27 '18 at 15:00

score 16 · Accepted Answer · edited Oct 07 '21 at 08:14

TL;DR: There's no conflict between the standards. OAuth access tokens can generally contain any printable ASCII character, but if the access token is a Bearer token it must use "token68" syntax to be HTTP/1.1 compliant.

RFC 6749, §1.4 tells us: "An access token is a string" and "usually opaque to the client". §A.12 defines it as one or more printable ASCII characters ([ -~]+ in regex terms).

RFC 6749 defines various methods for obtaining an access token, but doesn't concern itself with how to actually use an access token, other than saying that you "present it" to a resource server, which must validate and then accept or reject it.

But RFC 6749 does require the authorization server to tell the client the token type (another string), which the client can use to determine how the access token is used.

A token type string is either an IANA-registered type name (like Bearer or mac), or a vendor URL (like http://oauth.example.org/v1), though the URL is just a conveniently namespaced identifier, and doesn't have to resolve to anything.

In most deployments, the token type will be Bearer, the semantics of which are defined in RFC 6750.

RFC 6750 defines three methods (§§2.1–2.3) of presenting a Bearer access token to the resource server. The recommended method (which resource servers must support to be standards compliant) is to send it in the HTTP Authorization header (§2.1), in which case the token must be a "b64token" ([-a-zA-Z0-9._~+/]+=* in regex terms).
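As a rough illustration of the header method, the following Python sketch turns a token response into an Authorization header and refuses tokens that do not fit the b64token syntax. The function name and the shape of the `token_response` dict are assumptions made for this example; the case-insensitive comparison reflects RFC 6749 §5.1, which states that the `token_type` value is case insensitive.

```python
import re

# b64token syntax from RFC 6750, 2.1, written as a regular expression.
B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")


def bearer_authorization_header(token_response: dict) -> dict:
    """Build an Authorization header from a parsed token response (hypothetical helper)."""
    token_type = token_response["token_type"]
    if token_type.lower() != "bearer":
        raise ValueError(f"not a Bearer token: {token_type!r}")

    token = token_response["access_token"]
    if B64TOKEN.fullmatch(token) is None:
        # Valid under RFC 6749 perhaps, but not sendable unquoted in the header.
        raise ValueError("access token does not match the b64token syntax")

    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Sample response using the example token from RFC 6750.
    print(bearer_authorization_header(
        {"access_token": "mF_9.B5f-4.1JqM", "token_type": "Bearer"}
    ))
```

Rejecting the token on the client side is a design choice for this sketch; a real client might instead pass the value through and let the resource server decide.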

This matches what the HTTP/1.1 spec calls a "token68" (RFC 7235 §2.1), and is necessary to allow the token to be used unquoted in the HTTP Authorization header. (As for why HTTP/1.1 allows those exact characters, it comes down to historical reasons related to the HTTP/1.0 and Basic authentication standards, as well as limitations in current and historical HTTP implementations. Network protocols are a messy business.)

A "b64token" (aka "token68") permits a subset of ASCII characters usually used with base64 encoding, but (despite the name) the Bearer token does not impose any base64 semantics. It's just an opaque string that the client receives from one server and passes on to another. Implementations may assign semantics to it (e.g. JWT), but that's beyond the OAuth or Bearer token standards.
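The point that b64token is only a character-set rule, not an encoding, can be checked with a short Python sketch. It compares the syntax check with a strict base64 decode from the standard library; the helper names are assumptions for this example.

```python
import base64
import binascii
import re

# b64token syntax from RFC 6750, 2.1, written as a regular expression.
B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")


def matches_b64token(token: str) -> bool:
    """Syntax check only: is every character allowed, with '=' only at the end?"""
    return B64TOKEN.fullmatch(token) is not None


def decodes_as_base64(token: str) -> bool:
    """Strict standard-alphabet base64 decode; rejects '-', '.', '_' and '~'."""
    try:
        base64.b64decode(token, validate=True)
        return True
    except binascii.Error:
        return False


if __name__ == "__main__":
    for sample in ["dGVzdA==", "mF_9.B5f-4.1JqM"]:
        print(sample, matches_b64token(sample), decodes_as_base64(sample))
```

The second sample is the example token from RFC 6750: it passes the syntax check but is not decodable as standard base64, which is consistent with the token being an opaque string.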

RFC 6750 doesn't state that a Bearer access token must be a b64token if used with the other two (unrecommended) methods, but given that the client is supposed to be able to choose the method, it wouldn't make much sense to give it a non-b64token token.

Other OAuth token types might not rely on being passed unquoted in an HTTP header (or they might not use HTTP at all), and would thus be free to use any printable ASCII character. This might e.g. be useful for token types that are not opaque to the client; as an example, I'm currently dealing with a setup in which the access token response looks a bit like this:

{
  "access_token": "{\"endpoint\": \"srv8.example.org\", \"session_id\": \"fafc2fd\"}",
  "token_type": "http://vendor.example.org/",
  "expires_in": 3600,
  "refresh_token": "tGzv3JOkF0XG5Qx2TlKWIA"
}

Here, the access token is a JSON-encoded data structure, which the client must act upon (according to rules associated with the vendor token type) to access the protected resource.
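A client for such a vendor token type might handle the response roughly as in the Python sketch below. It parses the response shown above, then parses the access token a second time because the token is itself a JSON document. What the client does with `endpoint` and `session_id` afterwards depends on the vendor's rules, which are not described here, so the sketch stops at extracting them.

```python
import json
import re

# b64token syntax from RFC 6750, 2.1, for comparison.
B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")

# The token response from the example above, as a raw string.
token_response_body = r'''
{
  "access_token": "{\"endpoint\": \"srv8.example.org\", \"session_id\": \"fafc2fd\"}",
  "token_type": "http://vendor.example.org/",
  "expires_in": 3600,
  "refresh_token": "tGzv3JOkF0XG5Qx2TlKWIA"
}
'''

response = json.loads(token_response_body)
raw_access_token = response["access_token"]

# The access token is a string of printable ASCII, as RFC 6749 requires,
# but it contains characters ('{', '"', ':', ' ') outside the b64token set.
fits_b64token = B64TOKEN.fullmatch(raw_access_token) is not None

# Second parse: the token is itself JSON-encoded.
token_data = json.loads(raw_access_token)

if __name__ == "__main__":
    print(response["token_type"], fits_b64token)
    print(token_data["endpoint"], token_data["session_id"])
```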

score 1 · Answer 2 · edited Oct 07 '21 at 11:07

TL;DR: The Authorization header follows the Basic scheme defined in RFC2617, so the token should be base64 encoded.

This is highlighted by the following phrase in RFC6750:

The syntax of the "Authorization" header field for this scheme follows the usage of the Basic scheme defined in Section 2 of [RFC2617]

If you check RFC2617, the following ABNF is what requires base64 encoding of the user credentials:

credentials = "Basic" basic-credentials

basic-credentials = base64-user-pass

But as the OP has pointed out, the ABNF is defined as b64token, which allows more than base64 encoding. So in real-world implementations we see, for example, JWTs (base64 parts separated by ".") used as bearer tokens. This is acceptable because a JWT falls within the b64token ABNF.

Answers to the OP's questions:

An access token can contain any character from the %x20-7E range. There are no further restrictions; that is the definition of an access token.
If the access token is a bearer token (token_type=bearer), then it must follow the b64token syntax, AKA token68. This qualifies the access token to be placed in the Authorization header.
RFC6749 defines the format of the access token. RFC6750 defines how to use the Authorization header to transmit the access token.
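If an authorization server internally produces a token from the wider %x20-7E range and still wants to issue it as a bearer token, one option (an assumption for illustration, not something either RFC mandates) is to wrap it in base64url before handing it out. The Python sketch below shows the idea; the function names and the sample token are hypothetical.

```python
import base64
import re

# b64token syntax from RFC 6750, 2.1, written as a regular expression.
B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")


def wrap_for_bearer(raw_token: str) -> str:
    """Encode a printable-ASCII token so it fits the b64token syntax."""
    return base64.urlsafe_b64encode(raw_token.encode("ascii")).decode("ascii")


def unwrap_from_bearer(wire_token: str) -> str:
    """Reverse wrap_for_bearer on the receiving side."""
    return base64.urlsafe_b64decode(wire_token.encode("ascii")).decode("ascii")


if __name__ == "__main__":
    raw = 'user=42&scope=read write%'  # hypothetical token, valid only under RFC 6749
    wire = wrap_for_bearer(raw)
    print(B64TOKEN.fullmatch(raw) is not None)   # does the raw token fit b64token?
    print(B64TOKEN.fullmatch(wire) is not None)  # does the wrapped token fit?
    print(unwrap_from_bearer(wire) == raw)       # does it round-trip?
```

The base64url alphabet (letters, digits, "-", "_", plus trailing "=" padding) is a subset of the b64token characters, which is why the wrapped value can go into the Authorization header. Only the party that issued the token needs to know about the wrapping, since the client treats the token as opaque.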

b64token vs token68

There seems to be some confusion on naming of b64token.

After some searching I came across the following IETF discussions on RFC7235. RFC7235 defines the current standard for HTTP authentication (which includes the Authorization header too).

According to those discussions, b64token is a specific encoding, and there were suggestions to rename b64token to token68. That change was made, so b64token basically refers to token68.

The appendix of RFC7235 explains token68 in the HTTP Authorization header (note: these are excerpts; follow the link for the full ABNF):

Authorization = credentials

credentials = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param ) *( OWS "," [ OWS auth-param ] ) ] ) ]

token68 = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" ) *"="

So as far as I can see, RFC6750 was not updated with this naming (those definitions were still in progress at the time it was written).
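On the resource-server side, the token68 rule translates into a fairly small parser. The Python sketch below extracts a bearer token from an Authorization header value, following the `credentials = "Bearer" 1*SP b64token` rule from RFC 6750. Matching the scheme name case-insensitively is an assumption based on HTTP authentication schemes being case insensitive; the function name is hypothetical.

```python
import re
from typing import Optional

# "Bearer", one or more spaces, then a token68/b64token.
# (?i:...) makes only the scheme name case-insensitive.
BEARER_CREDENTIALS = re.compile(r"(?i:Bearer) +([A-Za-z0-9\-._~+/]+=*)")


def extract_bearer_token(header_value: str) -> Optional[str]:
    """Return the token if the header holds well-formed Bearer credentials, else None."""
    match = BEARER_CREDENTIALS.fullmatch(header_value)
    return match.group(1) if match else None


if __name__ == "__main__":
    for value in ["Bearer mF_9.B5f-4.1JqM", "Bearer abc&%", "Basic dGVzdA=="]:
        print(repr(value), extract_bearer_token(value))
```

Note that the parser only checks syntax. Whether the token is actually valid is a separate step that the resource server has to perform.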


answered Apr 26 '18 at 03:59

Kavindu Dodanduwa


I'm not sure this is correct. According to this https://github.com/bshaffer/oauth2-server-php/issues/100 RFC6750 does NOT mandate base64-encoding tokens: "Digging a bit deeper in to "HTTP/1.1, part 7: Authentication", however, I see that b64token is just an ABNF syntax definition allowing for characters typically used in base64, base64url, etc.. So the b64token doesn't define any encoding or decoding but rather just defines what characters can be used in the part of the Authorization header that will contain the access token." – bjmc Apr 26 '18 at 20:10
@bjmc I think we missed the following phrase: "The syntax of the "Authorization" header field for this scheme follows the usage of the Basic scheme defined in Section 2 of [RFC2617]". So basically it's base64 encoding, which is allowed through the usage of b64token. I was also confused until I went through the spec again. – Kavindu Dodanduwa Apr 27 '18 at 02:33
Are there examples of OAuth2-protected resources in the wild that require base64-encoded tokens? – bjmc Apr 27 '18 at 15:47
@bjmc Well, I have seen JWT access tokens in use, which match the ABNF (e.g. https://help.salesforce.com/articleView?id=remoteaccess_oauth_jwt_flow.htm&type=5) but are not a base64-encoded value. So I think usage is open (though the spec specifically refers to the Basic scheme). Azure AD also uses JWT access tokens - https://learn.microsoft.com/en-us/azure/active-directory/develop/active-directory-protocols-oauth-code#successful-response-1 – Kavindu Dodanduwa Apr 27 '18 at 17:55
That matches my experience. I have never encountered a service that required callers to base64-encode their tokens before including them in a Bearer header. JWTs are encoded, obviously, but they're not b64. – bjmc Apr 27 '18 at 22:15
@bjmc JWTs match the ABNF. They contain base64 parts and `.` to separate them, so a JWT used as a bearer token is totally valid. Indeed, there is some confusion between the two specs. But as I see it, as long as the token follows the ABNF, it's fine. Also, the access token is consumed by the identity provider (unless it's a JWT), so it can issue tokens in any format as long as they match the ABNF. – Kavindu Dodanduwa Apr 28 '18 at 04:02
Yeah, I agree with everything you've said in your last comment. Where I disagree with your answer is I don't think that RFC6750 mandates base64-encoding Bearer tokens in headers. As you say here, it just requires that you meet the `token68` (aka `b64token`) ABNF format. And `token68` is a stricter ABNF than `VSCHAR` set out in RFC6749. – bjmc Apr 28 '18 at 16:16
@bjmc I agree with your argument. We are talking about two different specs, so there could be conflicts like this. My point on base64 is based merely on the highlighted phrase ""Authorization" header field for this scheme follows the usage of the Basic scheme". But the ABNF allows much more than that, so actual implementations will vary, as we both have seen (e.g. JWT bearer tokens). Maybe I should add this point as well. – Kavindu Dodanduwa Apr 29 '18 at 03:22
@bjmc And about your last point: yes, the access token has a wider scope in its ABNF. But if it needs to qualify as a bearer token, then it should match the b64token ABNF. So ideally, the bearer token creator (identity provider) should create tokens that match the b64token ABNF, use JWT, or encode (e.g. with base64) a pure access token created in VSCHAR. – Kavindu Dodanduwa Apr 29 '18 at 03:33

score -1 · Answer 3 · edited Jun 20 '20 at 09:12

Well, I'm not an expert, but from my job experience I can tell you that you should always try to follow RFC6750; it technically says to use a base64-encoded string. Why? Because most requests where you use an OAuth method will suggest sending the token in the Authorization HTTP header, and base64 encoding uses safe ASCII characters, which guarantees that your HTTP request will be readable by (almost) all servers. Base64 is also easier to parse, and it is safe to use within JSON.

This mostly addresses your question 2.

EDIT

Based on the links you provided, here are their abstracts:

RFC6749

The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf. This specification replaces and obsoletes the OAuth 1.0 protocol described in RFC 5849.

RFC6750

This specification describes how to use bearer tokens in HTTP requests to access OAuth 2.0 protected resources. Any party in possession of a bearer token (a "bearer") can use it to get access to the associated resources (without demonstrating possession of a cryptographic key). To prevent misuse, bearer tokens need to be protected from disclosure in storage and in transport.

According to the abstracts of the two specifications, it is clear when you should use one or the other. In a few words, use RFC6749 when you are going to provide limited access to a web service and then ask for an authentication token, and use RFC6750 when you are going to ask for a Bearer token in your web service. Bearer tokens should always go in the Authorization header of the HTTP request, and Base64 strings are safe to transfer directly as part of the request.

Not sure whether this is correct, as these two specs define two different things. One describes how to define an access token and the other defines how to use the Authorization header to transmit the access token. Two specs for two purposes! — Kavindu Dodanduwa, Apr 26 '18 at 04:18
