Below is the code. I've also attached a photo of the table here.
The issue I am having is that tr[1] has the column headers (th tags), then tr[1:40] has row headers with th tags followed by td and th tags that correspond to the numbers in the table. The td cells are normal numbers, but the th cells are different because they have a color scheme.
I want to iterate through tr[1] to set the column headers, then use the first th tag inside each tr from there on as the row header. Any information would be much appreciated!
import ssl
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2
from bs4 import BeautifulSoup
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
req = Request('https://www.mrci.com/special/corr030.php', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req, context=ctx).read()
soup = BeautifulSoup(webpage, 'lxml') # Parse the HTML as a string
table = soup.find_all('table')[2] # Grab the correlation table
for row in table.find_all('tr')[1:]:
    print(row)
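
For reference, here is a minimal sketch of the kind of parsing described above. It is not tested against the real page. It assumes the header row starts with an empty corner th cell, and that every later row starts with one th row header followed by a mix of td and th value cells. The sample HTML, the function name parse_table, and the header_index parameter are all hypothetical and only stand in for the real table.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real table, with the same shape as described:
# a header row of th cells, then rows that start with a th row header and
# continue with td cells (plain) and th cells (colored).
SAMPLE_HTML = """
<table>
  <tr><th></th><th>AD</th><th>BP</th></tr>
  <tr><th>AD</th><th>100</th><td>45</td></tr>
  <tr><th>BP</th><td>45</td><th>100</th></tr>
</table>
"""


def parse_table(table, header_index=0):
    """Return (col_headers, data); data maps row header -> {col header: text}.

    header_index is the position of the column-header row (assumption:
    0 for the sample here; it may be 1 on the real page).
    """
    rows = table.find_all('tr')
    # Column headers: drop the first (corner) cell, keep the remaining th text.
    header_cells = rows[header_index].find_all('th')
    col_headers = [th.get_text(strip=True) for th in header_cells][1:]

    data = {}
    for row in rows[header_index + 1:]:
        # Passing a list matches both tag names, in document order.
        cells = row.find_all(['th', 'td'])
        if not cells:
            continue
        row_header = cells[0].get_text(strip=True)
        values = [cell.get_text(strip=True) for cell in cells[1:]]
        data[row_header] = dict(zip(col_headers, values))
    return col_headers, data


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
col_headers, data = parse_table(soup.find('table'))
print(col_headers)
print(data['AD'])
```

Because find_all(['th', 'td']) returns both tag types in document order, the colored th cells and the plain td cells end up in the same row list, and only the first cell of each row is treated as the row header.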