This is the HTML of the site I want to scrape:

<div id="quranOutput">
  <a class="key" name="1:1"></a>
    <div class="verse ayahBox1" id="verse_1">

This is the XPath I'm using in Django Dynamic Scraper, but it's not working:

//div[@class="ayah language_6 text"]/a/@name

What is the correct way to retrieve the value of the name attribute, i.e. the 1:1 in name="1:1"?

jh314

1 Answer

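In the HTML you posted, the <a class="key"> element is a direct child of <div id="quranOutput">, so select it from there: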

//div[@id="quranOutput"]/a[@class="key"]/@name

>>> import lxml.html
>>> 
>>> root = lxml.html.fromstring('''
... <html>
...     <body>
...         <div id="quranOutput">
...             <a class="key" name="1:1"></a>
...             <div class="verse ayahBox1" id="verse_1"></div>
...         </div>
...     </body>
... </html>''')
>>> 
>>> print(root.xpath('//div[@id="quranOutput"]/a[@class="key"]/@name'))
['1:1']
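
To see why the expression from the question comes back empty, here is a minimal sketch that runs both paths against the same fragment. It swaps lxml for the standard library's xml.etree.ElementTree so that it runs without extra packages. ElementTree only supports a subset of XPath and cannot select attributes with a trailing /@name, so the attribute is read with .get("name") instead. The helper anchor_names is a hypothetical name used only for this illustration, and the closing tags are added so the fragment parses as XML.

```python
import xml.etree.ElementTree as ET

# The fragment from the question, with closing tags added (an assumption)
# so that it parses as well-formed XML.
HTML = """
<html>
  <body>
    <div id="quranOutput">
      <a class="key" name="1:1"></a>
      <div class="verse ayahBox1" id="verse_1"></div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(HTML)


def anchor_names(path):
    """Return the name attribute of every <a> element matched by path."""
    return [a.get("name") for a in root.findall(path)]


# Path from the question, minus the trailing /@name: no div in this
# fragment carries that class, so nothing matches.
original = anchor_names('.//div[@class="ayah language_6 text"]/a')

# Path from the answer: the anchor is a direct child of div#quranOutput.
fixed = anchor_names('.//div[@id="quranOutput"]/a[@class="key"]')

print(original)  # []
print(fixed)     # ['1:1']
```

On the real page the first path may still match nothing for the same reason: the anchor sits next to the verse div, not inside a div with that class.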
falsetru