I have been trying to extract a table, but my code retrieves only the table's headings. This is my first attempt:
import pandas as pd
import requests
from bs4 import BeautifulSoup

url = r"https://www.sec.gov/edgar/search/#/q=Women&dateRange=custom&entityName=Infosys&startdt=2010-03-01&enddt=2020-03-01"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
table = soup.find_all("table")[1]
# Extract the column headings of the table.
rows = table.find_all('tr')
columns = []
headings = rows[0].find_all('th')
for col in headings:
    columns.append(col.text.strip())
print(columns)
# Extract the table data row by row.
all_data = []
for row in rows[1:]:
    data = row.find_all('td')
    lst = []
    for d in data:
        lst.append(d.text.strip())
    all_data.append(lst)
# Create the DataFrame from the extracted data.
ds = pd.DataFrame(all_data, columns=columns)
ds
Second way:
ds1 = pd.read_html(url)[0]
ds1
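A detail that may be relevant to both attempts: all of the search parameters in this URL come after the `#`, which makes them a URL fragment. Fragments are handled by the browser and are not sent to the server as part of an HTTP request, so both attempts presumably fetch only the bare search page. A small sketch using the standard library's `urlsplit` to show how the URL is split:

```python
from urllib.parse import urlsplit

url = (
    "https://www.sec.gov/edgar/search/#/q=Women&dateRange=custom"
    "&entityName=Infosys&startdt=2010-03-01&enddt=2020-03-01"
)

parts = urlsplit(url)

# Only the scheme, host, path and query string are part of the HTTP request.
print("path:    ", parts.path)
print("query:   ", repr(parts.query))

# Everything after '#' stays on the client side.
print("fragment:", parts.fragment)
```

The query string is empty here, so the search terms never reach the server through this URL alone.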
When I search for the table, I get all the column headings in the thead tag, but the tbody is empty.
table = soup.find_all("table", class_='table')
table
Output:
[<table class="table table-hover entity-hints" id="asdf"></table>,
<table class="table">
<thead>
<tr>
<th class="filetype" id="filetype">Form & File</th>
<th class="filed">Filed</th>
<th class="enddate">Reporting for</th>
<th class="entity-name">Filing entity/person</th>
<th class="cik">CIK</th>
<th class="located">Located</th>
<th class="incorporated">Incorporated</th>
<th class="file-num">File number</th>
<th class="film-num">Film number</th>
</tr>
</thead>
<tbody>
</tbody>
</table>]
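To rule out a mistake in my row-extraction loop, the check below counts header cells and body rows directly. It is only a sketch: the `html` string is a shortened copy of the output above (an assumption standing in for the live response), and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the HTML shown above, used in place of the live response.
html = """
<table class="table table-hover entity-hints" id="asdf"></table>
<table class="table">
<thead>
<tr>
<th class="filetype" id="filetype">Form &amp; File</th>
<th class="filed">Filed</th>
</tr>
</thead>
<tbody>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# class_="table" matches both tables; the results table is the second one.
results = soup.find_all("table", class_="table")[1]

header_cells = results.find("thead").find_all("th")
body_rows = results.find("tbody").find_all("tr")

print("header cells:", len(header_cells))
print("body rows:   ", len(body_rows))
```

If the live page gives the same counts (headers present, zero body rows), the rows are simply not in the HTML that `requests` received. That would suggest they are added after the page loads, probably by JavaScript, though I have not confirmed this.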
Why is the tbody tag empty?
Screenshot of the table: