I'm working on a Python 2.7 script to extract dates from a website. The code is as follows:

from lxml import etree
from urllib2 import urlopen

url = 'http://www.cardiffdevils.com/fixtures/'
newtree = etree.HTML(urlopen(url).read())

for section in newtree.xpath('//div[@class="month"]'):
    print section.xpath('h3[1]/text()')
    print section.xpath('//td[@class="date"]/text()')

The months are being output correctly, but I'm trying to limit the dates printed for each section to only those found within the corresponding "month" class; at the moment it spits out all the dates it finds in the whole page. Any pointers would be appreciated!

MattN

1 Answer

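Start your XPath with a period (.) to make it relative to the context element. An expression that begins with // is absolute, so it searches the whole document no matter which element you call xpath() on: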

print section.xpath('.//td[@class="date"]/text()')
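
To see the difference side by side, here is a minimal sketch that runs both forms against a small inline document. The markup in SAMPLE is a made-up stand-in (the real fixtures page may be structured differently), and dates_by_month is just an illustrative name. It uses print with a single formatted string so it runs on Python 2.7 as well as Python 3.

```python
from lxml import etree

# Hypothetical markup standing in for the fixtures page; the real page may differ.
SAMPLE = """
<html><body>
  <div class="month">
    <h3>September</h3>
    <table>
      <tr><td class="date">Sat 5th</td></tr>
      <tr><td class="date">Sun 6th</td></tr>
    </table>
  </div>
  <div class="month">
    <h3>October</h3>
    <table>
      <tr><td class="date">Sat 3rd</td></tr>
    </table>
  </div>
</body></html>
"""

tree = etree.HTML(SAMPLE)

dates_by_month = {}
for section in tree.xpath('//div[@class="month"]'):
    month = section.xpath('h3[1]/text()')[0]
    # '//td' is absolute: it matches every date cell in the document.
    absolute = section.xpath('//td[@class="date"]/text()')
    # './/td' is relative: it matches only date cells inside this section.
    relative = section.xpath('.//td[@class="date"]/text()')
    dates_by_month[month] = relative
    print('%s: absolute finds %d, relative finds %d'
          % (month, len(absolute), len(relative)))
```

With this sample, the absolute form should report the same count for every month, while the relative form reports only the dates that belong to each section.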

The last question I answered about this problem (different language, different XPath processor): Foreach not iterating through elements

har07