How do you pass a csrftoken with the python module Requests? This is what I have but it's not working, and I'm not sure which parameter to pass it into (data, headers, auth...)

import sys
import requests
from bs4 import BeautifulSoup

URL = 'https://portal.bitcasa.com/login'

client = requests.session(config={'verbose': sys.stderr})

# Retrieve the CSRF token first
soup = BeautifulSoup(client.get('https://portal.bitcasa.com/login').content)
csrftoken = soup.find('input', dict(name='csrfmiddlewaretoken'))['value']

login_data = dict(username=EMAIL, password=PASSWORD, csrfmiddlewaretoken=csrftoken)
r = client.post(URL, data=login_data, headers={"Referer": "foo"})

Same error message every time.

<h1>Forbidden <span>(403)</span></h1>
<p>CSRF verification failed. Request aborted.</p>
Jeff
  • What does `r.text` return? Still `CSRF verification failed`? I see the form also has a `next` field (defaults to `/`), maybe that needs to be added? Double-check what is posted when you do it manually. – Martijn Pieters Nov 26 '12 at 15:10
  • @MartijnPieters yes `CSRF verification failed. Request aborted.` – Jeff Nov 26 '12 at 15:12
  • Doing it manually, I see the next field has / as well. – Jeff Nov 26 '12 at 15:15
  • What else is posted? Just `username`, `password`, `csrfmiddlewaretoken` and `next`? Or are there other fields in addition? What happens when you add `next='/'` to your `login_data` dictionary? – Martijn Pieters Nov 26 '12 at 15:57
  • That's everything that's posted. Setting `next='/'` gives the same error. – Jeff Nov 26 '12 at 16:35
  • Wait a sec, what is `URL` set to? – Martijn Pieters Nov 26 '12 at 16:38
  • `URL = 'https://portal.bitcasa.com/login'` – Jeff Nov 26 '12 at 16:38
  • Note: you can skip the whole beautifulsoup parsing and just take the csrf token from the cookie; do run the `client.get` but don't parse, just use `value = client.cookies['csrftoken']` instead. Otherwise, no clue. – Martijn Pieters Nov 26 '12 at 16:49
  • I've updated my [previous answer](http://stackoverflow.com/questions/13553249/python-requests-login-to-website-returns-403) to remove the BeautifulSoup page altogether; the cookie is easier and faster to retrieve. – Martijn Pieters Nov 26 '12 at 16:55
  • I certainly see the same error (created an account). It's not the token that makes this fail, it's the referrer I think. – Martijn Pieters Nov 26 '12 at 17:13
  • Yeah, just figured it out. I changed the Referer to the URL and it worked magically. Not sure why though. I'll have to read up on that. Thank you so much for your help Martijn. – Jeff Nov 26 '12 at 17:24
  • Because the [CSRF checking code](https://github.com/django/django/blob/master/django/middleware/csrf.py) first checks the referrer, then the CSRF token. I thought the error message would be visible, but it's not shown unless the server is in debug mode, which is what threw me at first as to why the code wasn't working. Then I tried it myself, saw the same error and went back to the referrer, which *must* match the hostname. – Martijn Pieters Nov 26 '12 at 17:27

2 Answers

If you are going to set the referrer header, then for that specific site you need to set the referrer to the same URL as the login page:

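import requests

URL = 'https://portal.bitcasa.com/login'
EMAIL = 'you@example.com'   # placeholder: your account email
PASSWORD = 'your-password'  # placeholder: your account password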

client = requests.session()

# Retrieve the CSRF token first
client.get(URL)  # sets cookie
if 'csrftoken' in client.cookies:
    # Django 1.6 and up
    csrftoken = client.cookies['csrftoken']
else:
    # older versions
    csrftoken = client.cookies['csrf']

login_data = dict(username=EMAIL, password=PASSWORD, csrfmiddlewaretoken=csrftoken, next='/')
r = client.post(URL, data=login_data, headers=dict(Referer=URL))

Over plain HTTP, the Referer header is often stripped and is easy to spoof anyway, so most sites do not require it. Over HTTPS, however, it makes sense for a site to validate that the header at least references something that could plausibly have initiated the request. Django does this when the connection is encrypted (the URL uses https://): it then requires the header to be present and to match the site's hostname.
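
To make that check concrete, here is a minimal sketch of the kind of strict Referer test described above. It is a simplification for illustration only, not Django's actual middleware code; the function name `referer_is_acceptable` and its parameters are hypothetical.

```python
from urllib.parse import urlsplit


def referer_is_acceptable(referer, request_host, is_secure):
    """Simplified illustration of a strict Referer check on HTTPS requests."""
    if not is_secure:
        # Over plain HTTP this sketch does not look at the header at all.
        return True
    if not referer:
        # On HTTPS a missing Referer is rejected.
        return False
    parts = urlsplit(referer)
    # The Referer must itself be HTTPS and point at the same host.
    return parts.scheme == "https" and parts.netloc == request_host


# The value from the question fails, the login URL passes.
print(referer_is_acceptable("foo", "portal.bitcasa.com", True))  # False
print(referer_is_acceptable("https://portal.bitcasa.com/login",
                            "portal.bitcasa.com", True))         # True
```

This is why `"Referer": "foo"` produced the 403 even though the token itself was correct.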

Martijn Pieters

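Similarly, with Django's test client (`django.test.Client`, named `csrf_client` below), the main difference is that the cookie is an object rather than a plain string, so you pass `csrftoken.value` in `login_data`. Tested with Django 1.10.5: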

import django
from django.test import Client

django.setup()
csrf_client = Client(enforce_csrf_checks=True)

URL = 'http://127.0.0.1/auth/login'
EMAIL = 'test-user@test.com'
PASSWORD = 'XXXX'

# Retrieve the CSRF token first
csrf_client.get(URL)  # sets cookie
csrftoken = csrf_client.cookies['csrftoken']

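login_data = dict(username=EMAIL, password=PASSWORD, csrfmiddlewaretoken=csrftoken.value, next='/')
# The test client takes extra headers as WSGI-style keyword arguments
# (a headers= argument only exists in Django 4.2 and later).
r = csrf_client.post(URL, data=login_data, HTTP_REFERER=URL)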
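
The `.value` access is needed because the test client keeps its cookies in an `http.cookies.SimpleCookie`, whose items are `Morsel` objects rather than strings. The standard-library sketch below illustrates the difference without needing Django; the token string is a made-up placeholder.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()
cookies["csrftoken"] = "abc123"  # placeholder token, not a real one

morsel = cookies["csrftoken"]
print(type(morsel).__name__)  # Morsel
print(morsel.value)           # abc123
```

With `requests`, by contrast, `client.cookies['csrftoken']` already returns the string, which is why the first answer does not need `.value`.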
storm_m2138