Goal:
The goal is to generate a robot in replit which will iteratively scrape yahoo pages like this amazon page, and track the dynamic 'Volume' datapoint for abnormally large changes. I'm currently trying to be able to reliably pull this exact datapoint down, and I have been using the yahoo_fin API to do so. I have also considered using bs4, but I'm not sure if it is possible to use BS4 to extract dynamic data. (I'd greatly appreciate it if you happen to know the answer to this: can bs4 extract dynamic data?)
Problem:
The script seems to work, but it does not stay online due to what appears to be an error in yahoo_fin. Usually within around 5 minutes of turning the bot on, it throws the following error:
File "/home/runner/goofy/scrape.py", line 13, in fetchCurrentVolume
table = si.get_quote_table(ticker)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/yahoo_fin/stock_info.py", line 293, in get_quote_table
tables = pd.read_html(requests.get(site, headers=headers).text)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/html.py", line 1098, in read_html
return _parse(
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/html.py", line 926, in _parse
raise retained
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/html.py", line 906, in _parse
tables = p.parse_tables()
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/html.py", line 222, in parse_tables
tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/html.py", line 552, in _parse_tables
raise ValueError("No tables found")
ValueError: No tables found
However, this usually happens after a number of tables have already been found.
Here is the fetchCurrentVolume
function:
import yahoo_fin.stock_info as si
def fetchCurrentVolume(ticker):
table = si.get_quote_table(ticker)
currentVolume = table['Volume']
return currentVolume
and the API documentation is found above under Goal. Whenever this error message is displayed, the bot exits a @tasks.loop
, and the robot goes offline. If you know of a way to fix the current use of yahoo_fin, OR any other way to obtain the dynamic data found in this xpath: '//div[@id="quote-summary"]/div/table/tbody/tr'
, then you will have pulled me out of a 3 weeks long debacle with this issue! Thank you.