I'm trying to extract data from an HTML file. The code I'm using runs, but not as I expect: I can get some elements but not all of them, and I'm wondering whether that has to do with the size of the file I'm reading.
I'm currently trying to parse the source of this webpage.
This page is 4500 lines long, so it is fairly large. I chose it because I'd like to make sure the code works on large files.
The code I'm using is:
import urllib2
import lxml.html

# Download the raw page source
webHTML = urllib2.urlopen('http://hobbyking.com/hobbyking/store/__39036__Turnigy_Multistar_2213_980Kv_14Pole_Multi_Rotor_Outrunner.html').read()
# Parse the source into an lxml element tree
webHTML = lxml.html.fromstring(webHTML)
# Look up the element by its id attribute
productDetails = webHTML.get_element_by_id('productDetails')
# Print the text of each child element of that element
for element in productDetails:
    print element.text_content()
This gives the expected output when I use an element id of 'mm3' or another one near the top of the page, but with the id 'productDetails' I get no output, at least on my current setup.
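One check that might narrow this down, sketched on made-up HTML rather than the real page: iterating over an lxml element only visits its child elements, so a loop over a div that contains only text prints nothing even though the div itself was found. `SAMPLE_HTML` and `describe` are hypothetical names used for illustration, and whether the real 'productDetails' div looks like this after parsing is an assumption I have not confirmed.

```python
import lxml.html

# Hypothetical stand-in for the real page: one div with child elements,
# and one div that holds only text (no child elements).
SAMPLE_HTML = """
<html><body>
  <div id="mm3"><a>Home</a><a>Store</a></div>
  <div id="productDetails">Motor, 980Kv, 14 pole</div>
</body></html>
"""


def describe(doc, element_id):
    """Return (number of child elements, stripped text) for the given id."""
    element = doc.get_element_by_id(element_id)
    # len() on an element counts its direct child elements;
    # text_content() collects the text of the element and all descendants.
    return len(element), element.text_content().strip()


doc = lxml.html.fromstring(SAMPLE_HTML)
for element_id in ('mm3', 'productDetails'):
    child_count, text = describe(doc, element_id)
    print('%s: %d children, text=%r' % (element_id, child_count, text))
```

If the real element reports zero children but non-empty text, calling text_content() on the element itself instead of looping over it would be the thing to try. If it reports zero children and empty text, the parser may have closed the div earlier than the source suggests, which would point at malformed markup rather than file size.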