The code below extracts the P/E ratio from the first Reuters link. However, my method is not robust: the page for another stock has two fewer lines, which shifts the data so that a fixed index picks up the wrong value. How can I overcome this problem? I would like to point straight at the P/E row to extract the data, but I do not know how to do it.
Link 1: http://www.reuters.com/finance/stocks/financialHighlights?symbol=MYEG.KL
Link 2: http://www.reuters.com/finance/stocks/financialHighlights?symbol=ANNJ.KL
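import requests
from lxml import html

# Download the page and parse it into an element tree
page2 = requests.get('http://www.reuters.com/finance/stocks/financialHighlights?symbol=MYEG.KL')
treea = html.fromstring(page2.content)
# Text of every <td> that has a class attribute
tree4 = treea.xpath('//td[@class]/text()')
# Fixed position of the P/E value; this breaks when the page layout shifts
PE = tree4[37]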
This is the part of the page I want the code to target directly, so that other changes to the webpage do not affect the result:
<tr class="stripe">
<td>P/E Ratio (TTM)</td>
<td class="data">36.79</td>
<td class="data">25.99</td>
<td class="data">21.70</td>
</tr>
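One direction that might work is to select the row by its label text instead of by position. The following is only a minimal sketch, run against the snippet above rather than the live page. It assumes the label stays exactly "P/E Ratio (TTM)" and the value cells keep class="data"; the names SAMPLE and extract_pe are made up for illustration.

```python
from lxml import html

# Sample markup copied from the snippet above. For the live page, the tree
# would come from html.fromstring(requests.get(url).content) instead.
SAMPLE = """
<table>
<tr class="stripe">
<td>P/E Ratio (TTM)</td>
<td class="data">36.79</td>
<td class="data">25.99</td>
<td class="data">21.70</td>
</tr>
</table>
"""


def extract_pe(tree):
    """Return the data cells of the row labelled 'P/E Ratio (TTM)' as floats.

    Hypothetical helper: it relies on the label text and the "data" class
    staying the same on the real page.
    """
    cells = tree.xpath(
        '//tr[td[normalize-space(text())="P/E Ratio (TTM)"]]'
        '/td[@class="data"]/text()'
    )
    return [float(c.strip()) for c in cells]


tree = html.fromstring(SAMPLE)
pe_values = extract_pe(tree)
# First data cell of the row; None if the row was not found
pe = pe_values[0] if pe_values else None
print(pe_values, pe)
```

Because the XPath matches on the row label, rows added or removed elsewhere on the page should not shift the result. If the label itself changes, the list comes back empty instead of silently returning a wrong number.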