I am getting different responses from Python (the requests library) and curl, although both send exactly the same headers.

Python:

import requests

headers = {
    'Accept-Language': 'en-US,en',
    'Accept': 'text/html,application/xhtml+xml,application/xml',
    'Authority': 'www.google.com',
    'User-Agent': 'SomeAgent',
    'Upgrade-Insecure-Requests': '1',
}

response = requests.get('https://www.avvo.com', headers=headers)
# Returns a 403 response

Curl:

import shlex, subprocess
cmd = '''curl -H 'Accept-Language: en-US,en' -H 'Accept: text/html,application/xhtml+xml,application/xml' -H 'Authority: www.google.com' -H 'User-Agent: SomeAgent' -H 'Upgrade-Insecure-Requests: 1' https://www.avvo.com'''
args = shlex.split(cmd)
process = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process.communicate()
# Returns a 200 response
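
One way to rule out a copy-paste mismatch between the two snippets is to build the curl command from the same `headers` dict that is passed to `requests.get`. The following is only a sketch: `build_curl_args` and `curl_status` are hypothetical helper names, and the `-s -o /dev/null -w '%{http_code}'` options (print only the status code) assume a POSIX system with curl on the PATH.

```python
import shlex
import subprocess

headers = {
    'Accept-Language': 'en-US,en',
    'Accept': 'text/html,application/xhtml+xml,application/xml',
    'Authority': 'www.google.com',
    'User-Agent': 'SomeAgent',
    'Upgrade-Insecure-Requests': '1',
}


def build_curl_args(url, headers):
    """Build a curl argument list that sends the given headers (hypothetical helper)."""
    # -s: no progress meter, -o /dev/null: discard the body,
    # -w '%{http_code}': print only the HTTP status code.
    args = ['curl', '-s', '-o', '/dev/null', '-w', '%{http_code}']
    for name, value in headers.items():
        args += ['-H', f'{name}: {value}']
    args.append(url)
    return args


def curl_status(url, headers):
    """Run curl and return the status code it printed (needs curl and network access)."""
    result = subprocess.run(build_curl_args(url, headers),
                            capture_output=True, text=True)
    return result.stdout.strip()


args = build_curl_args('https://www.avvo.com', headers)
print(shlex.join(args))  # a shell-quoted command that can be pasted into a terminal
```

Calling `curl_status('https://www.avvo.com', headers)` would then give a status code that can be compared directly with `response.status_code` from requests.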

Both requests are sent from the same IP address. It looks like a Cloudflare issue. Is there any way Cloudflare can distinguish a request made with the Python requests library from one made with a direct curl command?
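
To check whether the two clients really send identical headers, one option is to point each of them at a throwaway local server that records what it receives. The sketch below assumes nothing about the real site: `capture_headers` and `send_with_urllib` are hypothetical names, and `urllib` is used only as a stand-in client so the example runs with the standard library. Because the server speaks plain HTTP on localhost, it can reveal extra or reordered headers but nothing about the TLS handshake.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def capture_headers(send):
    """Start a local server, call send(url), and return the headers it received."""
    received = []

    class RecordingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Keep header names and values in the order they arrived.
            received.append(list(self.headers.items()))
            self.send_response(200)
            self.send_header('Content-Length', '0')
            self.end_headers()

        def log_message(self, *args):
            pass  # suppress the default per-request log line

    server = HTTPServer(('127.0.0.1', 0), RecordingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        send(f'http://127.0.0.1:{server.server_port}/')
    finally:
        server.shutdown()
        server.server_close()
    return received[0]


def send_with_urllib(url):
    # Stand-in client; swap in requests.get(url, headers=...) or the curl
    # subprocess call to see what those clients actually put on the wire.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers={'User-Agent': 'SomeAgent'})
    opener.open(request).close()


captured = capture_headers(send_with_urllib)
for name, value in captured:
    print(f'{name}: {value}')
```

Running `capture_headers` once with a requests-based sender and once with a curl-based sender, then comparing the two lists, would show any headers that one client adds on its own.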

I left the website in the code in case it's useful to run. Here is the curl command on its own:

curl -H 'Accept-Language: en-US,en' -H 'Accept: text/html,application/xhtml+xml,application/xml' -H 'Authority: www.google.com' -H 'User-Agent: SomeAgent' -H 'Upgrade-Insecure-Requests: 1' https://www.avvo.com/administrative-law-lawyer/ny.html
superdee
  • Possible duplicate of [Curl and Python Requests (get) reporting different http status code](https://stackoverflow.com/questions/51268405/curl-and-python-requests-get-reporting-different-http-status-code) – Niloct Aug 02 '19 at 00:03
  • Both methods returned 403 when I tried. I thought it might be caused by a CAPTCHA configured to challenge unusual user agents, but spoofing a legitimate user agent made no difference, so there may be other checks put in place by the site owner. – FaizAzhar Aug 03 '19 at 05:51
