lxml / xpath - limiting output

Question

I'm working on a Python 2.7 script to extract dates from a website. Code is as follows:

from lxml import html, etree
from urllib2 import urlopen
import requests

url = 'http://www.cardiffdevils.com/fixtures/'
newtree = etree.HTML(urlopen(url).read())

for section in newtree.xpath('//div[@class="month"]'):
    print section.xpath('h3[1]/text()')
    print section.xpath('//td[@class="date"]/text()')

The months are being output correctly, but I'm trying to limit the dates printed for each section to only those found within the corresponding "month" class; at the moment it spits out all the dates it finds in the whole page. Any pointers would be appreciated!

score 0 · Accepted Answer · edited May 23 '17 at 10:27

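An XPath expression that begins with // is evaluated from the document root, no matter which element you call xpath() on, which is why every section returns every date on the page. Start the expression with a period (.) to make it relative to the context element: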

print section.xpath('.//td[@class="date"]/text()')

The last question I answered about this problem (different language, different XPath processor): Foreach not iterating through elements
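
To see the difference side by side, here is a minimal sketch that runs both forms against a small inline document. The markup, month names, and dates are made up for illustration and only loosely follow the structure described in the question; the real page may differ.

```python
from lxml import etree

# Hypothetical markup standing in for the fixtures page (an assumption,
# not the real site's HTML).
SAMPLE = """
<html><body>
  <div class="month">
    <h3>September</h3>
    <table>
      <tr><td class="date">Sat 3rd</td></tr>
      <tr><td class="date">Sun 4th</td></tr>
    </table>
  </div>
  <div class="month">
    <h3>October</h3>
    <table>
      <tr><td class="date">Sat 1st</td></tr>
    </table>
  </div>
</body></html>
"""

tree = etree.HTML(SAMPLE)
sections = tree.xpath('//div[@class="month"]')

# Leading "//": searched from the document root, so each section
# gets the dates of the whole page.
absolute = [s.xpath('//td[@class="date"]/text()') for s in sections]

# Leading ".//": searched from the section element itself, so each
# section gets only its own dates.
relative = [
    (s.xpath('string(h3[1])'), s.xpath('.//td[@class="date"]/text()'))
    for s in sections
]

for month, dates in relative:
    print('%s: %s' % (month, ', '.join(dates)))
```

With this sample, the absolute form gives the same three dates for both sections, while the relative form gives two dates for September and one for October.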

answered Sep 29 '16 at 08:12

har07

You're a scholar and (I assume) a gentleman, that's exactly what I needed. Thank you! – MattN Sep 29 '16 at 08:24
