I'm trying to scrape some numbers from the TD Ameritrade website with urllib.request and Beautiful Soup, but I suspect the site runs something that swaps in incorrect numbers to prevent web scraping. For example, when I parse the next earnings date from the URL 'https://research.tdameritrade.com/grid/public/research/stocks/earnings?symbol=goog', my code returns "(Unconfirmed) July 25, 2022", whereas the earnings date displayed in the website's HTML is "July 26, 2022".
Is that what is happening, or is there simply something wrong with my code? Is there any way to get around this? Here is my code:
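from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = 'https://research.tdameritrade.com/grid/public/research/stocks/earnings?symbol=goog'

# Fetch the raw HTML exactly as the server returns it (no JavaScript is executed)
request_site = Request(url)
page_html = urlopen(request_site).read()
page_soup = BeautifulSoup(page_html, "html.parser")

# Find the <td> cells whose class attribute is "value week-of"
# (find_all is the current name for the older findAll)
earnings = page_soup.find_all("td", {"class": "value week-of"})

# Print the text of the first matching cell
earnings_text = earnings[0].text
print(earnings_text)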
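
To rule out a parsing mistake on my side, the extraction step can be tested on its own against a saved HTML string, separately from the network request. Below is a minimal sketch of that check using only the standard library's html.parser. The names WeekOfCellParser, extract_week_of and SAMPLE_HTML are made up for illustration, and the sample markup is an assumption, not the real TD Ameritrade page source.

```python
from html.parser import HTMLParser


class WeekOfCellParser(HTMLParser):
    """Collect the text of every <td> whose classes include 'value' and 'week-of'."""

    def __init__(self):
        super().__init__()
        self._inside = False   # True while we are inside a matching <td>
        self._buffer = []      # text fragments of the current cell
        self.cells = []        # cleaned text of every matching cell

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and {"value", "week-of"} <= set(classes):
            self._inside = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "td" and self._inside:
            # Join the fragments and collapse runs of whitespace
            self.cells.append(" ".join("".join(self._buffer).split()))
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)


def extract_week_of(html):
    """Return the text of all matching cells found in an HTML string."""
    parser = WeekOfCellParser()
    parser.feed(html)
    parser.close()
    return parser.cells


# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="label">Next earnings</td>
      <td class="value week-of"><span>(Unconfirmed)</span> July 25, 2022</td></tr>
</table>
"""

cells = extract_week_of(SAMPLE_HTML)
print(cells)
```

If the same function is run on the bytes returned by urlopen (decoded to a string) and still gives a date different from the one in the browser, the difference is already present in the HTML the server sent rather than being introduced by the parsing step.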