I am trying to use the asyncio and aiohttp packages to request a web page. However, the web page response is:
<p class="warning-title"> Please upgrade your web browser. </p> <br/>
<p class="p-top-30">This browser version is outdated, and may not be fully compatible with our website. Please upgrade to a newer version or use another browser. </p>
Instead of the page I'm trying to access, it loads the homepage.
CODE
from fake_useragent import UserAgent
import ssl
from bs4 import BeautifulSoup
import asyncio
import aiohttp

ua = UserAgent()
# Browser-like request headers; the User-Agent is a Chrome string from fake_useragent
hdr = {'User-Agent': str(ua.chrome),
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.5',
       'Connection': 'keep-alive'}

# SSL context with hostname checking and certificate verification disabled
ssl_ctx = ssl.create_default_context()
ssl_ctx.check_hostname = False
ssl_ctx.verify_mode = ssl.CERT_NONE

url = '...'

async def parse_website(session):
    # Fetch the page and print the parsed HTML
    async with session.get(url) as response:
        html = await response.text()
        soup = BeautifulSoup(html, 'html.parser')
        print(soup)

async def main():
    # 'async with' is only valid inside a coroutine, so the driver code lives here
    async with asyncio.Semaphore(3):
        async with aiohttp.TCPConnector(ssl=ssl_ctx, limit=None) as connector:
            async with aiohttp.ClientSession(connector=connector, headers=hdr) as session:
                for i in range(1):
                    await parse_website(session)

asyncio.run(main())
I have tried leaving out the headers argument, i.e. async with aiohttp.ClientSession(connector=connector) as session:, but then the response says that I didn't wait long enough for the captcha. So I have to pass the headers argument to get past the captcha, but then I consistently get the Please upgrade your browser response. I also tried adding cookies={} to the same call, async with aiohttp.ClientSession(connector=connector, headers=hdr, cookies={}) as session:, but I get the same original response saying the browser is out of date.
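Since the only difference between the captcha response and the upgrade warning is the headers argument, one thing worth checking is which browser version the generated User-Agent string actually claims. The sketch below is only a diagnostic aid, not a confirmed cause: chrome_major_version and the sample string are hypothetical, and in the code above the input would be str(ua.chrome).

```python
import re
from typing import Optional


def chrome_major_version(user_agent: str) -> Optional[int]:
    """Return the Chrome major version found in a User-Agent string, or None."""
    match = re.search(r"Chrome/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


# Hypothetical sample value; the real one would come from str(ua.chrome).
sample_ua = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
)

if __name__ == "__main__":
    print(chrome_major_version(sample_ua))  # prints 41
```

Printing this value next to the response would show whether the warning lines up with an old version number in the header.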
I'm also only showing one URL request here. Once I have this working I'll scale to thousands, so that's why I'm trying to make this work with the asyncio and aiohttp packages.
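A minimal sketch of how the scaled-up version might be structured, assuming the semaphore is meant to cap the number of requests in flight. Here fetch_one, fetch_all, and the placeholder URLs are hypothetical, and asyncio.sleep stands in for the real session.get call so the example runs without network access.

```python
import asyncio
from typing import List

MAX_CONCURRENT = 3  # same limit as the Semaphore(3) in the code above


async def fetch_one(url: str, semaphore: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for parse_website; the sleep replaces session.get."""
    async with semaphore:  # at most MAX_CONCURRENT coroutines are inside at once
        await asyncio.sleep(0.01)
        return f"fetched {url}"


async def fetch_all(urls: List[str]) -> List[str]:
    # One semaphore shared by every task, acquired per request
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(fetch_one(u, semaphore) for u in urls))


if __name__ == "__main__":
    placeholder_urls = [f"https://example.invalid/page/{i}" for i in range(10)]
    results = asyncio.run(fetch_all(placeholder_urls))
    print(len(results))
```

The point of the sketch is that the semaphore is acquired inside each request coroutine; wrapping the whole session in a single async with asyncio.Semaphore(3) acquires it only once and does not limit concurrency.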
Could someone tell me where I'm going wrong here?