In OAuth, the initial authorization request has a state parameter. Apparently it's there for security reasons, but I don't really understand what it protects against. For instance, on GitHub the description of this parameter is:

An unguessable random string. It is used to protect against cross-site request forgery attacks.

From what I can see, the state from the authorization request is just passed as a parameter to the redirect URL like this:

http://<redirect_url>?code=17b1a8df59ddd92c5c3b&state=a4e0761e-8c21-4e20-819d-5a4daeab4ea9

Could someone explain the exact purpose of this parameter?

Thomas Levesque
  • See also [csrf - OAuth2 Cross Site Request Forgery, and state parameter - Information Security Stack Exchange](https://security.stackexchange.com/questions/20187/oauth2-cross-site-request-forgery-and-state-parameter) – Peter V. Mørch Mar 08 '21 at 11:39
  • I found this great article while searching for the same question: https://medium.com/@benjamin.botto/oauth-replay-attack-mitigation-18655a62fe53 – ahmedakef Mar 21 '23 at 23:42

2 Answers

The state parameter is used to protect against XSRF (cross-site request forgery). Your application generates a random string and sends it to the authorization server in the state parameter. The authorization server sends the same value back, unchanged, on the redirect. If the returned state matches the one your application sent, the response belongs to a request you started. If the two values differ, someone else has initiated the request.

The example from Google is maybe clearer: https://developers.google.com/accounts/docs/OAuth2Login?hl=en#createxsrftoken
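
The round trip described above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the `session` dict stands in for whatever per-user session store the application has, and the names `new_state`, `state_matches`, and `oauth_state` are hypothetical.

```python
import hmac
import secrets


def new_state(session: dict) -> str:
    """Generate an unguessable state value and remember it in the user's session."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state  # hypothetical session key
    return state


def state_matches(session: dict, returned_state: str) -> bool:
    """Compare the state echoed back on the redirect with the one we stored."""
    expected = session.pop("oauth_state", None)  # pop: each value is single use
    if expected is None or not returned_state:
        return False
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode("utf-8"),
                               returned_state.encode("utf-8"))
```

`new_state` would be called just before redirecting to the authorization server, and `state_matches` in the handler for the redirect URL, before the `code` is exchanged for a token.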

Turbcool
meziantou
  • "someone else has initiated the request": thanks, that's what I was missing. I'm not in the context of a web app, so it doesn't apply to my case (I just detect the redirect in a WebBrowser control in a desktop app, no one is going to send requests to me...) – Thomas Levesque Oct 01 '14 at 01:22
  • The developers of [ckanext-oauth2](https://github.com/conwetlab/ckanext-oauth2/blob/master/ckanext/oauth2/oauth2.py#L46) also use the state parameter to store info about the previously visited page, to redirect the user back there after login, e.g.: `{"came_from": "/dashboard"}`. They *base64* encode it to make it URL-safe and then use it for the `state` parameter. – jeverling Nov 12 '18 at 19:23
  • @jeverling wouldn't that be guessable? – Sriram Kailasam Aug 04 '20 at 11:18
  • @SriramKailasam, yes, you are right, it is guessable, and `came_from` is not suited as an XSRF token. But I think in the case of this plugin they don't use the `state` parameter for an XSRF token. I don't think there is anything keeping you from storing an XSRF token, plus something like e.g. `came_from`, in the `state` parameter. – jeverling Aug 16 '20 at 17:27
  • You missed a really important point: the state parameter should be somehow tied to your session. – Aditya Jan 13 '21 at 07:31
  • Would you say this is only necessary for applications that require really high security? Or should every app/website that uses Google oauth2 implement this? After all, it's optional. – Florian Walther Aug 20 '22 at 18:22
  • Hi, can I use a CSRF token for the state parameter? My application already generates a UUID as its CSRF token, so can I use the same UUID for the state param when making the OAuth request? I have asked a similar question: https://stackoverflow.com/questions/74917368/can-i-use-csrf-token-as-value-for-state-parameter-in-oauth-flow – Amogh Dec 26 '22 at 07:30

state is echoed back in the query string sent to the redirect_uri. It has two purposes:

  1. The original use, as its name suggests, is to transmit state information from the initiating webpage to the redirect_uri. For example, I have a process that sends users a link that allows them to link their account to some other resource. The link contains information (a token) that describes that resource. So the user clicks the link and is taken to a web page, from which they are redirected to the authentication server. I need that token to make its way through the authentication process back to the redirect_uri, so that the business logic behind the redirect_uri can finish linking the resources to the user's account. I do this by using "state" to carry that information. The OAuth specification sets no size limit on "state" (though practical URL length limits still apply), so it can be used to carry all sorts of very useful information.

  2. As a means to protect against XSRF... It is very easy for a third party to forge a request to the authentication server and trick your redirect_uri into accepting the response. If the state information is a randomly generated value and the redirect_uri is in some way able to validate it as being its own, then you can protect yourself against this.
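
The first purpose, carrying data through the flow, might look like this in Python. It is only a sketch: the helper names `encode_state` and `decode_state` and the `resource_token` field are made up for illustration, and the URL-safe Base64 encoding of JSON follows the approach mentioned in the comments on the other answer. On its own this value is neither secret nor tamper-proof, so it covers purpose 1 only.

```python
import base64
import json


def encode_state(data: dict) -> str:
    """Serialize data to JSON and make it safe to place in a query string."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_state(state: str) -> dict:
    """Reverse encode_state on the value echoed back to the redirect_uri."""
    raw = base64.urlsafe_b64decode(state.encode("ascii"))
    return json.loads(raw)


# Hypothetical payload: the token that describes the resource to link.
state = encode_state({"resource_token": "abc123", "came_from": "/dashboard"})
original = decode_state(state)
```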

It is a misconception that the random value must be bound to or stored in some session. Though this is a valid way of verifying the state, it is flawed. If there is a delay in authentication (for example a user hanging in the login screen for several hours), then such a validation method would fail if the session had expired. It would also fail if the user had initiated multiple logins by mistake or by double clicking.

It is far better, IMHO, to use a private (secret) key to sign the random value. For example, take the SHA256 hash (preferable to SHA1) of a random number + the private key as the signature. Make the state value a combination of the random number and the signature, converting the binary values to Base64. It is common to use a full stop as a separator, in the same manner as a JWT:

state = random + "." + signature

where signature = BASE64_SHA256(random + privateKey)

You verify this by splitting the state value at the full stop as follows:

verifyValue = BASE64_SHA256(split(state,".")[0] + privateKey)

If verifyValue = split(state,".")[1], then the state is valid. There is no pesky session storage, so validation is much faster.
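
A rough Python sketch of this signed-state scheme follows. Two things differ from the pseudocode and are assumptions of the sketch: it swaps in HMAC-SHA256 (from the standard `hmac` module) for the plain hash of `random + privateKey`, since HMAC is the usual construction for keyed signatures, and it uses URL-safe Base64 so that the value needs no extra escaping. `SECRET_KEY` and the function names are hypothetical. Note that a valid signature shows only that the application issued the value, not which browser it was issued to.

```python
import base64
import hashlib
import hmac
import secrets

# Assumption: in practice the key is loaded from configuration, not hard-coded.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(message: str) -> str:
    """Return the URL-safe Base64 HMAC-SHA256 of message under SECRET_KEY."""
    digest = hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def make_signed_state() -> str:
    """Build state = random + "." + signature."""
    random_part = secrets.token_urlsafe(16)  # contains no "." characters
    return random_part + "." + _sign(random_part)


def verify_signed_state(state: str) -> bool:
    """Split at the full stop, recompute the signature, and compare."""
    random_part, sep, signature = state.partition(".")
    if not sep or not random_part:
        return False
    return hmac.compare_digest(_sign(random_part).encode("ascii"),
                               signature.encode("utf-8"))
```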

It is perfectly feasible to combine the two purposes into one. That is, you can take the known value that you want your redirect_uri to know about, add the random number, and then add the signature as a SHA256 hash of both values. E.g.:

state = data + "." + random + "." + signature

where signature = BASE64_SHA256(data + "." + random + privateKey)

This protects the whole state value: everything in it is covered by the signature, so any tampering is detected, which doesn't happen if you use a simple security token stored against the session (as per the method in https://developers.google.com/identity/openid-connect/openid-connect?hl=en#createxsrftoken).