I'm new to bs4 (BeautifulSoup) and would like help extracting the text from a series of web page tables. One of them looks like this:
<table style="padding:0px; margin:1px" width="715px">
<tr>
<td height="22" width="33%" >
<span class="darkGreenText"><strong> Name: </strong></span>
Tyto alba
</td>
<td height="22" width="33%" >
<span class="darkGreenText"><strong> Order: </strong></span>
Strigiformes
</td>
<td height="22" width="33%">
<span class="darkGreenText"><strong> Family: </strong></span>
Tytonidae
</td>
<td height="22" width="66%" colspan="2">
<span class="darkGreenText"><strong> Status: </strong></span>
Least Concern
</td>
</tr>
</table>
Desired output:
Name: Tyto alba
Order: Strigiformes
Family: Tytonidae
Status: Least Concern
I've tried using [index] as recommended (https://stackoverflow.com/a/35050622/1726290), and also next_sibling (https://stackoverflow.com/a/23380225/1726290), but I'm getting stuck because one part of the text I need is inside a tag and the other part is not. Any help would be appreciated.
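For reference, here is a minimal sketch of one possible approach, not a definitive solution. It assumes every label/value pair lives in its own td, as in the sample above, and it uses the built-in "html.parser" backend. The names SAMPLE_HTML and cell_lines are made up for illustration. The idea is that stripped_strings walks all text nodes in a cell, tagged or not, so the label inside the strong tag and the bare text after the span come out together.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question (hypothetical variable name).
SAMPLE_HTML = """
<table style="padding:0px; margin:1px" width="715px">
<tr>
<td height="22" width="33%" >
<span class="darkGreenText"><strong> Name: </strong></span>
Tyto alba
</td>
<td height="22" width="33%" >
<span class="darkGreenText"><strong> Order: </strong></span>
Strigiformes
</td>
<td height="22" width="33%">
<span class="darkGreenText"><strong> Family: </strong></span>
Tytonidae
</td>
<td height="22" width="66%" colspan="2">
<span class="darkGreenText"><strong> Status: </strong></span>
Least Concern
</td>
</tr>
</table>
"""


def cell_lines(html):
    """Return one 'Label: value' string per <td> in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for td in soup.find_all("td"):
        # stripped_strings yields every non-empty text node in the cell,
        # whitespace-trimmed: first the tagged label, then the untagged value.
        parts = list(td.stripped_strings)
        if parts:
            lines.append(" ".join(parts))
    return lines


for line in cell_lines(SAMPLE_HTML):
    print(line)
```

If only the untagged part is needed, td.span.next_sibling should be that bare text node in this markup, and calling .strip() on it removes the surrounding newlines. Real pages may nest things differently, so the selectors may need adjusting.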