Web scraping stock data from Reuters

Question

I am a programming beginner trying to extract key metric data (e.g. Beta) for a stock from Reuters. However, the result always comes back empty.

My code looks like this:

from bs4 import BeautifulSoup as bs
import requests
import re

url = 'https://www.reuters.com/markets/companies/TSLA.OQ/key-metrics/price-and-volume'
page = requests.get(url)
bs1 = bs(page.text, 'html.parser')

beta=bs1.find_all('th', class_ ='text__text__1FZLe text__dark-grey__3Ml43 text__regular__2N1Xr text__body__yKS5U body__base__22dCE body__body__VgU9Q',text=re.compile('Beta'))
print(beta)

I know this is not correct, but I cannot figure out what to do. Ultimately, I want to extract the Beta value for a stock from Reuters. Thank you for your help!

Sam · Answer 1 · 2022-07-05T10:44:48.747

Here's one way of collecting the data you need:

from bs4 import BeautifulSoup as bs 
import requests
import re

url = 'https://www.reuters.com/markets/companies/TSLA.OQ/key-metrics/price-and-volume'
page = requests.get(url)
soup = bs(page.text, 'html.parser')

# Locate the Table you wish to scrape
table = soup.select_one('table.table__table__2px_A')

# Locate the Keys and Value for each of the rows
keys = [i.text for i in table.select('tr th') if i]
values = [i.text for i in table.select('tr td') if i]

# Convert the two lists into a dictionary for a neater output
data = dict(zip(keys,values))

This will return:

{'% Change': '671.00',
 'Brent Crude Oil': '-1.40%Negative',
 'CBOT Soybeans': '1,626.00',
 'Copper': '111.91',
 'Future': '1,805.20',
 'Gold': '-0.57%Negative',
 'Last': '+0.35%Positive'}

edited Jul 05 '22 at 10:44

answered Jul 05 '22 at 10:38


Thanks for your response. I am trying to extract the key metrics (such as Beta, 52 Week High, etc.) that are inside the table. The current output contains data from outside the table titled "price and volume". Can you please help? Thank you! – YY789 Jul 06 '22 at 10:32
Sorry, I just looked into the HTML: the table containing the data you want isn't present in the HTML that requests receives. – Sam Jul 06 '22 at 11:46
Oh, I see. No wonder I can't seem to extract it. It seems more complicated than I thought. Anyway, thanks for responding! – YY789 Jul 06 '22 at 13:36
I've had a look. You can find the data in a script tag; it's loaded via JavaScript. If you want to spend the time looking through the script, you could load the JSON (which contains the information you want) and extract it from there. Another way is Selenium: it's slower than requests, but it lets you scrape content rendered by JavaScript. – Sam Jul 06 '22 at 13:53
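
The script-tag route mentioned in the last comment could look roughly like the sketch below. It is only an illustration: the sample HTML, the `application/json` script type, and the `key_metrics` field name are all hypothetical, since the thread never shows the real Reuters payload. It uses the standard library's `html.parser` so that it runs without a browser or network access.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: the real Reuters markup and JSON layout are not shown
# in this thread, so this structure is an assumption for illustration only.
SAMPLE_HTML = """
<html><body>
<div id="root"></div>
<script type="application/json">
{"key_metrics": {"Beta": "2.13", "52 Week High": "1,243.25"}}
</script>
</body></html>
"""


class JsonScriptCollector(HTMLParser):
    """Collect the text of every <script type="application/json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self.payloads.append("")

    def handle_data(self, data):
        # Script text may arrive in more than one chunk, so append to it.
        if self._in_json_script:
            self.payloads[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_script = False


collector = JsonScriptCollector()
collector.feed(SAMPLE_HTML)

# Parse the first JSON payload and read the (hypothetical) metrics field.
metrics = json.loads(collector.payloads[0])["key_metrics"]
print(metrics["Beta"])
```

On the real page you would first have to inspect the response to find which script tag holds the data and how the JSON is nested. With bs4, the collector class could likely be replaced by `soup.find('script', type='application/json')` and reading its `.string`.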

Sam · Accepted Answer · 2022-07-07T08:11:48.467

You can scrape the site with Selenium, without having to inspect the JavaScript/JSON. The code below reuses the bs4 parsing from my previous answer, although you could use Selenium's own element-finding functions instead.

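from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

url = 'https://www.reuters.com/markets/companies/TSLA.OQ/key-metrics/price-and-volume'

# Initiate webdriver
driver = webdriver.Firefox()
try:
    # Fetch the web page
    driver.get(url)

    # Wait until the JavaScript-rendered table is present
    # (the 15-second timeout is an arbitrary choice)
    WebDriverWait(driver, 15).until(
        EC.presence_of_element_located(
            (By.CSS_SELECTOR, 'table[aria-label="KeyMetrics"]')
        )
    )

    # Convert the driver page source to a soup object
    soup = bs(driver.page_source, 'html.parser')
finally:
    # Always close the browser, even if the page fails to load
    driver.quit()

# Find the table you want to scrape
table = soup.find('table', attrs={'aria-label': 'KeyMetrics'})

# Locate the keys and values in the table body only (this skips the header row)
keys = [i.text for i in table.select('tbody tr th')]
values = [i.text for i in table.select('tbody tr td')]

# Convert the two lists into a dictionary for a neater output
data = dict(zip(keys, values))
print(data)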

This will return:

{'Price Closing Or Last Bid': '699.20', 'Pricing Date': 'Jul 05', '52 Week High': '1,243.25', '52 Week High Date': 'Nov 04', '52 Week Low': '620.50', '52 Week Low Date': 'Jul 08', '10 Day Average Trading Volume': '31.36', '3 Month Average Trading Volume': '602.72', 'Market Capitalization': '724,644.30', 'Beta': '2.13', '1 Day Price Change': '2.55', '5 Day Price Return (Daily)': '-4.84', '13 Week Price Return (Daily)': '-35.93', '26 Week Price Return (Daily)': '-39.18', '52 Week Price Return (Daily)': '2.99', 'Month To Date Price Return (Daily)': '3.83', 'Year To Date Price Return (Daily)': '-33.84', 'Price Relative To S&P500 (4 Week)': '5.95', 'Price Relative To S&P500 (13 Week)': '-24.33', 'Price Relative To S&P500 (26 Week)': '-23.90', 'Price Relative To S&P500 (52 Week)': '16.99', 'Price Relative To S&P500 (YTD)': '-17.69'}

Wow, amazing! It worked except for one minor bug: the keys and the values don't line up. For example, Beta should be 2.13, but it shows as -4.84 (the value for the 5 Day Price Return). – YY789 Jul 06 '22 at 16:55
I've updated my answer; this should work now. I realised I was also picking up the table headers (Title, Value). I've removed these by searching only in the table body. – Sam Jul 07 '22 at 08:12
Thank you very much! You are so helpful and so good with Python! – YY789 Jul 12 '22 at 08:47
Can you mark the above as the answer, please, so that it closes this question? – Sam Jul 12 '22 at 09:50
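
The misalignment discussed in these comments can be reproduced without a browser. The sketch below uses plain lists in place of the parsed cells. The row data is shortened from the output above, and the header cells follow the "Title, Value" headers mentioned in the comments. It shows why collecting every `th` shifts the keys by two, and how pairing label and value row by row avoids that drift.

```python
# Cell texts as they would come out of a table whose header row is
# <th>Title</th><th>Value</th> (shortened, illustrative data).
header_cells = ["Title", "Value"]
body_rows = [
    ("Market Capitalization", "724,644.30"),
    ("Beta", "2.13"),
    ("1 Day Price Change", "2.55"),
    ("5 Day Price Return (Daily)", "-4.84"),
]

# Selecting every <th> in the table also picks up the two header cells,
# while selecting every <td> yields only the body values.
all_th = header_cells + [label for label, _ in body_rows]
all_td = [value for _, value in body_rows]

# Zipping the two flat lists shifts every key two positions away from its value.
misaligned = dict(zip(all_th, all_td))
print(misaligned["Beta"])

# Pairing label and value within each row cannot drift.
aligned = {label: value for label, value in body_rows}
print(aligned["Beta"])
```

Restricting the selectors to `tbody`, as in the updated answer, removes the two extra header cells. A row-by-row variant with bs4 would loop over `table.select('tbody tr')` and read each row's `th` and `td` together, which also stays aligned if a row is ever missing a cell.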
