I'm scraping this site, specifically the content of the tables inside the div tags whose class contains 'ranking-data'. So for the first td, that would be:
//div[contains(@class, 'ranking-data')]//tr[th//text()[contains(., 'TIN')]]/td[1]/text()
This works fine for all columns in all tables (with the needed modifications), except for a cell in column 2 that contains an i tag: in Google Spreadsheets, the result has an extra blank cell below the cell with the text itself. I first tried to scrape it with:
//div[contains(@class, 'ranking-data')]//tr[th//text()[contains(., 'TIN')]]/td[2]/text()
Then I tried appending something like *[not(i[contains(@class,'info-circle')])]/text() after the td[2], along with some other variants, but none of them worked.
How can I avoid this i tag?
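
To illustrate what I suspect is happening, here is a minimal Python sketch. It assumes the cell looks roughly like the hypothetical markup below (the real page may differ, and the class names and value are made up): the td would then have two direct text nodes, the value before the i tag and a whitespace-only node after it, which could explain the extra blank cell.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the row; the structure of the real page is an assumption.
row = ET.fromstring(
    "<tr><th>TIN</th><td>first</td>"
    "<td>\n  12345\n  <i class=\"fa fa-info-circle\"></i>\n</td></tr>"
)

# Second td of the row (the equivalent of td[2] in XPath).
cell = row.findall("td")[1]

# Direct text nodes of the cell: the text before <i> plus the "tail" after each child.
text_nodes = [cell.text or ""] + [(child.tail or "") for child in cell]

# Keep only nodes that contain something other than whitespace.
non_blank = [t.strip() for t in text_nodes if t.strip()]

print(len(text_nodes))  # number of direct text nodes in the cell
print(non_blank)        # text nodes left after dropping whitespace-only ones
```

If that assumption holds, the XPath counterpart of this filter would be a predicate on the text nodes, such as td[2]/text()[normalize-space()], but I have not confirmed this against the actual page.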