I have the following sample HTML table from an HTML file:
<table>
<tr>
<th>Class</th>
<th class="failed">Fail</th>
<th class="failed">Error</th>
<th>Skip</th>
<th>Success</th>
<th>Total</th>
</tr>
<tr>
<td>Regression_TestCase.RegressionProject_TestCase2.RegressionProject_TestCase2</td>
<td class="failed">1</td>
<td class="failed">9</td>
<td>0</td>
<td>219</td>
<td>229</td>
</tr>
<tr>
<td><strong>Total</strong></td>
<td class="failed">1</td>
<td class="failed">9</td>
<td>0</td>
<td>219</td>
<td>229</td>
</tr>
</table>
I am trying to print the text from the <td> tags in the row whose first <td> contains:
Regression_TestCase.RegressionProject_TestCase2.RegressionProject_TestCase2
I do not want to include the text from the <td> tags in the row that starts with:
<td><strong>Total</strong></td>
My code is printing the text from every single <td> tag:
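from bs4 import BeautifulSoup

def extract_data_from_report():
    # Read the whole report and parse it
    html_report = open(r"E:\SeleniumTestReport.html", 'r').read()
    soup = BeautifulSoup(html_report, "html.parser")
    th = soup.find_all('th')
    td = soup.find_all('td')
    # Print the header cells on one line
    for item in th:
        print item.text,
    print "\n"
    # Prints every <td> in the document, including the Total row
    for item in td:
        print item.text,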
My desired output:
Class Fail Error Skip Success Total
Regression_TestCase 1 9 0 219 229
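One possible way to get that output is to walk the table row by row instead of collecting every <td> at once, and to skip the row whose cells contain a <strong> tag. The following is only a minimal sketch, written for Python 3 and using the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs. The names RowCollector, extract_rows and SAMPLE are made up for illustration, and two things are assumptions: that the summary row is the only one containing <strong>, and that the desired output wants only the part of the class name before the first dot.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect each <tr> as a list of cell texts and note rows containing <strong>."""

    def __init__(self):
        super().__init__()
        self.rows = []            # list of (cells, has_strong) tuples
        self._cells = None        # cells of the row currently being parsed
        self._text = None         # text fragments of the cell currently being parsed
        self._has_strong = False  # True once <strong> is seen inside the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._has_strong = [], False
        elif tag in ("th", "td") and self._cells is not None:
            self._text = []
        elif tag == "strong" and self._text is not None:
            self._has_strong = True

    def handle_data(self, data):
        # Only keep text that appears inside a <th> or <td>
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._text is not None:
            self._cells.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr" and self._cells is not None:
            self.rows.append((self._cells, self._has_strong))
            self._cells = None


def extract_rows(html_report):
    """Return the table rows as lists of strings, without the bold Total row."""
    parser = RowCollector()
    parser.feed(html_report)
    rows = []
    for cells, has_strong in parser.rows:
        if has_strong or not cells:
            continue  # assumption: only the summary row uses <strong>
        # Assumption: keep only the part of the first cell before the first dot
        rows.append([cells[0].split(".")[0]] + cells[1:])
    return rows


# The sample table from the question, shortened to one line per row
SAMPLE = """
<table>
<tr><th>Class</th><th class="failed">Fail</th><th class="failed">Error</th>
<th>Skip</th><th>Success</th><th>Total</th></tr>
<tr><td>Regression_TestCase.RegressionProject_TestCase2.RegressionProject_TestCase2</td>
<td class="failed">1</td><td class="failed">9</td><td>0</td><td>219</td><td>229</td></tr>
<tr><td><strong>Total</strong></td>
<td class="failed">1</td><td class="failed">9</td><td>0</td><td>219</td><td>229</td></tr>
</table>
"""

for cells in extract_rows(SAMPLE):
    print(" ".join(cells))
# Class Fail Error Skip Success Total
# Regression_TestCase 1 9 0 219 229
```

With BeautifulSoup the same idea would presumably be to loop over soup.find_all('tr'), skip any row where row.find('strong') returns something, and print the text of the remaining row's cells.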