Introduction

Hello,

I have a web application where the frontend makes a request to the backend and the backend responds by setting a cookie on the browser (for a session). In addition, a CSRF token is attached to the header of the response.


Problem

Once the initial request is complete, the browser has the cookie set and the CSRF token is available in the frontend. Even so, the backend would at times respond that the CSRF session token is missing. That seemed strange, since the network tab of the browser's developer tools showed both the cookie and the CSRF token in the request.


Investigation

I previously read this article on Flask contexts and figured it might have some information related to the problem I'm experiencing.

Based on my current setup, I have the following:

  • Hosting platform: Heroku
  • WSGI server: Gunicorn
  • REST API: Flask application

My WSGI server spawns two workers, each running the Flask application, to handle all the requests coming in from the frontend.

Code for the endpoint that issues the CSRF token

@app.route('/get-csrf', methods=['GET'])
def get_csrf():
    token = generate_csrf() 
    ...
    response.headers.set('X-CSRFToken', token)
    return response

Code for generate_csrf

def generate_csrf(secret_key=None, token_key=None):
    """Generate a CSRF token. The token is cached for a request, so multiple
    calls to this function will generate the same token.

    During testing, it might be useful to access the signed token in
    ``g.csrf_token`` and the raw token in ``session['csrf_token']``.

    :param secret_key: Used to securely sign the token. Default is
        ``WTF_CSRF_SECRET_KEY`` or ``SECRET_KEY``.
    :param token_key: Key where token is stored in session for comparison.
        Default is ``WTF_CSRF_FIELD_NAME`` or ``'csrf_token'``.
    """

    secret_key = _get_config(
        secret_key,
        "WTF_CSRF_SECRET_KEY",
        current_app.secret_key,
        message="A secret key is required to use CSRF.",
    )
    field_name = _get_config(
        token_key,
        "WTF_CSRF_FIELD_NAME",
        "csrf_token",
        message="A field name is required to use CSRF.",
    )

    if field_name not in g:
        s = URLSafeTimedSerializer(secret_key, salt="wtf-csrf-token")

        if field_name not in session:
            session[field_name] = hashlib.sha1(os.urandom(64)).hexdigest()

        try:
            token = s.dumps(session[field_name])
        except TypeError:
            session[field_name] = hashlib.sha1(os.urandom(64)).hexdigest()
            token = s.dumps(session[field_name])

        setattr(g, field_name, token)

    return g.get(field_name)

My understanding is that when a request to /get-csrf arrives, Flask creates an application context (and a request context). The application context:

Keeps track of the application-level data (configuration variables, logger, database connection)

It can be accessed through the g and current_app proxies.

Once the request is complete, both contexts are destroyed.

Within generate_csrf, a value is stored in session and g is then assigned the signed CSRF token. My understanding is that g is only available for the lifetime of the request, after which it's gone. That is fine for the purposes of generating a token.

However, the main issue (to me) is session within the request context. The documentation doesn't really explain how it works.

For example, in generate_csrf a session is created, then a token is stored in g. After the request is complete, g is basically gone. But what happens to session?

Below are some logs I have in production:

[9] [DEBUG] GET /get-csrf

[10] [DEBUG] POST /login-session
[10] [DEBUG] session cookie sent
[10] [DEBUG] field_name: csrf_token
[10] [INFO] The CSRF session token is missing.

[9] [DEBUG] POST /login-session
[9] [DEBUG] session cookie sent
[9] [DEBUG] field_name: csrf_token
[9] [DEBUG] Session key: csrf_token

In the logs above, [9] and [10] represent two separate workers. The CSRF token is generated by [9] and the session cookie is sent to the browser. The next request, to /login-session, is handled by [10], but there the session is not available. A second request is handled by [9], and the session is present.

This leads me to believe that sessions are stored per worker. How would one work around this?

Update - May 4, 2022

I came across this post and then I realized that I had something similar to the following in my code:

app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY') + os.urandom(50)

Correct me if I'm wrong, but I presume that each worker ends up with a different secret key when it is spawned, so a session cookie signed by one worker is invalid for the other.
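A minimal sketch of that effect, assuming each Gunicorn worker imports the application module itself (so the random suffix is computed once per worker). It uses plain HMAC from the standard library rather than Flask's actual signing code, and the names BASE_SECRET, worker_9_key, and worker_10_key are made up for illustration.

```python
import hashlib
import hmac
import os


def sign(payload: bytes, key: bytes) -> bytes:
    """Return an HMAC-SHA256 signature of payload under key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(payload: bytes, signature: bytes, key: bytes) -> bool:
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign(payload, key), signature)


# Hypothetical stand-in for the value read from the environment.
BASE_SECRET = b"value-from-env"

# Each worker appends its own random bytes, so the keys differ.
worker_9_key = BASE_SECRET + os.urandom(50)
worker_10_key = BASE_SECRET + os.urandom(50)

payload = b'{"csrf_token": "abc123"}'
signature = sign(payload, worker_9_key)

# The worker that signed the cookie accepts it ...
same_worker_ok = verify(payload, signature, worker_9_key)
# ... but the other worker rejects it, so the session looks empty there.
other_worker_ok = verify(payload, signature, worker_10_key)

# With one shared key, either worker can verify what the other signed.
shared_key_ok = verify(payload, sign(payload, BASE_SECRET), BASE_SECRET)
```

Under these assumptions the session data is present in the cookie on every request, but only the worker that signed it can validate it, which would match the alternating behaviour in the logs.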

I updated my secret key configuration to the following:

app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY') 

It seems to have fixed it, but I still need to verify this.
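One thing to watch with the line above: os.environ.get returns None when the variable is unset, and the problem would then only surface later, when the session is first used. A small sketch of loading the key so that a missing value fails at startup instead; load_secret_key is a hypothetical helper, not part of Flask.

```python
import os


def load_secret_key(environ=os.environ, name="SECRET_KEY") -> str:
    """Return the secret key from the environment, or fail loudly.

    Hypothetical helper: every worker reads the same value, and a
    missing or empty variable stops the app at startup.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; refusing to start without a secret key")
    return value


# Intended usage in the app module (not executed here):
# app.config['SECRET_KEY'] = load_secret_key()
```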

jeff