In the interests of educational web scraping, I'm trying to parse some HTML, but something I can't explain is happening. When I view the page's source code via developer tools in Chrome, I see
<link rel="preload" href="/static/bundles/es6/Consumer.js/54df4d9114a3.js" as="script" type="text/javascript" crossorigin="anonymous" />
but when I load it in Python via requests.get(url, headers) I get
<link rel="preload" href="/static/bundles/metro/Consumer.js/54df4d9114a3.js" as="script" type="text/javascript" crossorigin="anonymous" />
The only difference is that es6 has been replaced by metro. What may be causing this? What could cause the same URL to return different static HTML?
I'm using an identical User-Agent string to what's shown in Dev Tools, so I suspect I could be missing some other header information.
    import requests

    headers = {
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Methods': 'GET',
        'Access-Control-Allow-Headers': 'Content-Type',
        'Access-Control-Max-Age': '3600',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36'
    }
    html_content = requests.get(url, headers).text
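One thing worth checking in the call above: the signature is requests.get(url, params=None, **kwargs), so a dict passed as the second positional argument is bound to params rather than headers. It would then be encoded into the query string, and the library's default User-Agent would be sent instead of the Chrome one. The Access-Control-* entries are also CORS response headers that a server sends, so they would not normally be expected to change what comes back. Below is a minimal sketch that prepares both variants without sending anything; the URL is a hypothetical placeholder and the User-Agent string is shortened for readability.

```python
import requests

# Hypothetical URL for illustration; nothing is sent over the network.
url = "https://example.com/page"
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"}

session = requests.Session()

# Equivalent of requests.get(url, headers): the dict is bound to `params`.
positional = session.prepare_request(requests.Request("GET", url, params=headers))

# Equivalent of requests.get(url, headers=headers).
keyword = session.prepare_request(requests.Request("GET", url, headers=headers))

print(positional.url)                    # the dict shows up in the query string
print(positional.headers["User-Agent"])  # the library default, python-requests/<version>
print(keyword.headers["User-Agent"])     # the browser-like string from the dict
```

If that is what is happening here, passing the dict as headers=headers would be the first thing to try. Whether that alone explains the es6/metro difference is an assumption that would need checking against the real site.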
I'm aware of this question with an almost identical title, Requests.get showing different HTML than Chrome's Developer Tool, but it doesn't answer my question, and I don't want to use Selenium or a Web Driver because I'm after speed.