The ESG information you want is stored as JSON inside a <script> tag in the HTML that is returned.
First you need to locate the correct script tag, i.e. the one containing the JSON text. find() can then be used to locate the start and end of the JSON within it.
The JSON can then be converted into a Python data structure using Python's json.loads() function.
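As a minimal offline sketch of the find() and json.loads() steps: the `script_text` below is a made-up, heavily shortened stand-in for the real script contents (an assumption for illustration only), so the slicing can be tried without making a request.

```python
import json

# Hypothetical, shortened stand-in for the contents of the real <script> tag.
script_text = (
    "(function (root) {\n"
    'root.App.main = {"context": {"dispatcher": {"stores": {}}}};\n'
    "}(this));\n"
)

marker = "root.App.main = "
start = script_text.find(marker)       # index where the marker begins
end = script_text.find("\n", start)    # end of the line holding the JSON

# Skip past the marker and drop the trailing ";" before parsing.
json_text = script_text[start + len(marker):end - 1]
data = json.loads(json_text)

print(data["context"]["dispatcher"]["stores"])
```

Using `len(marker)` rather than a hard-coded offset keeps the slice correct if the marker text ever needs changing.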
All of the data can now be accessed using standard Python list/dictionary type notation. The ESG scores are buried quite deep inside the structure. I would recommend first printing out just the JSON and using an online tool to format it. There are then tools which can show you the 'path' to access any item in the JSON.
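If you would rather find the path in Python than with an online tool, a small recursive search is one option. This is only a sketch: `find_paths` is a hypothetical helper (not part of any library), and `sample` is a made-up miniature of the parsed JSON.

```python
def find_paths(obj, target_key, path=()):
    """Yield the path (a tuple of keys/indexes) to every occurrence of target_key."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target_key:
                yield path + (key,)
            # Keep searching below this key, whether or not it matched.
            yield from find_paths(value, target_key, path + (key,))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_paths(item, target_key, path + (index,))


# Hypothetical miniature of the parsed JSON, just to show the idea.
sample = {
    "context": {
        "dispatcher": {
            "stores": {"QuoteSummaryStore": {"esgScores": {"peerCount": 66}}}
        }
    }
}

for found in find_paths(sample, "esgScores"):
    print(found)
```

Each printed tuple lists the keys to apply in order, e.g. `data[found[0]][found[1]]...`.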
The ESG scores can then be accessed as follows:
from bs4 import BeautifulSoup
import requests
import json

# A browser-like User-Agent is sent with the request.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'}
req = requests.get("https://uk.finance.yahoo.com/quote/XOM/sustainability?p=XOM&.tsrc=fin-srch", headers=headers)
soup = BeautifulSoup(req.content, 'html.parser')

for script in soup.find_all('script'):
    # Only the script that assigns root.App.main holds the JSON.
    if script.string and "root.App.main" in script.string:
        f1 = script.string.find("root.App.main = ")
        f2 = script.string.find("\n", f1)
        # Skip the 16-character marker and drop the trailing ';' before the newline.
        data = json.loads(script.string[f1+16:f2-1])
        esg_scores = data['context']['dispatcher']['stores']['QuoteSummaryStore']['esgScores']

        for key, value in esg_scores.items():
            print(f"{key:40} {value}")
        break
This shows the available ESG data as:
palmOil False
peerSocialPerformance {'min': 2.04, 'avg': 10.441525423728814, 'max': 19.64}
controversialWeapons False
ratingMonth 5
gambling False
socialScore {'raw': 9.82, 'fmt': '9.8'}
nuclear False
furLeather False
alcoholic False
gmo False
catholic False
socialPercentile None
peerGovernancePerformance {'min': 4.73, 'avg': 8.467627118644069, 'max': 13.63}
peerCount 66
relatedControversy ['Operations Incidents']
governanceScore {'raw': 8.14, 'fmt': '8.1'}
environmentPercentile None
animalTesting True
peerEsgScorePerformance {'min': 8.24, 'avg': 37.78166666666667, 'max': 58.64}
tobacco False
totalEsg {'raw': 36.46, 'fmt': '36.5'}
highestControversy 3
esgPerformance OUT_PERF
coal False
peerHighestControversyPerformance {'min': 0, 'avg': 2.0454545454545454, 'max': 5}
pesticides False
adult False
ratingYear 2022
maxAge 86400
percentile {'raw': 81.8, 'fmt': '82'}
peerGroup Oil & Gas Producers
smallArms False
peerEnvironmentPerformance {'min': 0.12, 'avg': 18.42101694915254, 'max': 26.75}
environmentScore {'raw': 18.51, 'fmt': '18.5'}
governancePercentile None
militaryContract False
I suggest you print(data) to better understand all of the information available.
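Printing the whole structure with a plain print() gives one very long line, so indenting it with json.dumps() may be easier to read. A minimal sketch, where `esg_scores` is a small hand-copied subset of the output above rather than live data:

```python
import json

# Hypothetical stand-in for the real dictionary; values copied from the output above.
esg_scores = {
    "totalEsg": {"raw": 36.46, "fmt": "36.5"},
    "peerGroup": "Oil & Gas Producers",
    "relatedControversy": ["Operations Incidents"],
}

# indent=2 puts each key on its own line; sort_keys makes items easier to find.
pretty = json.dumps(esg_scores, indent=2, sort_keys=True)
print(pretty)
```

The same call works on the full `data` dictionary, although the result will be long.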