This might be a completely foolish question, but Google has been no help. First, of course, I import the libraries I need:
from lxml import html
from lxml import etree
import requests
Simple enough. Now to fetch and parse the page. The link in this case is the weekly lunch menu for a local restaurant. Here we prepare the parsed trees so we can extract our bits from them.
page = requests.get("http://www.farozon.se/lunchmeny-20207064")
tree = html.fromstring(page.text)
htmlparser = etree.HTMLParser()
tree2 = etree.parse(page.raw, htmlparser)
Now let's take a look at the menu! As you can see, I am testing several different ways of getting the desired output.
friday = tree.cssselect("#block_82470858 > div > div > div.h24_frame_personal_text.h24_frame_padding > div > table > tbody > tr:nth-child(4)")
test = tree.xpath("/html/body")
Let's just print the output to see what we get.
print page
print tree.cssselect('#block_82470858 > div > div > div.h24_frame_personal_text.h24_frame_padding > div > table > tbody > tr:nth-child(4)')
print tree2
print friday
print test
I was looking forward to eating something... Wait, that ain't food. What the heck is that? Both in the attempt above and in my IDE, I've tried the approaches from Google's top 20 results for lxml and requests. They all claim to output the actual HTML, but they all print the same kind of thing. I have no clue what's going on. This is what I get:
<Response [200]>
[<Element tr at 0x30139f0>]
<lxml.etree._ElementTree object at 0x2db0dd0>
[<Element tr at 0x30139f0>]
[<Element body at 0x3013a48>]
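For reference, the lines above look like the default representations of the objects themselves (a `Response`, lists of `Element`s, an `ElementTree`), not their HTML. Below is a minimal sketch of turning an element into markup or plain text with `lxml.html.tostring` and `text_content()`. The `SAMPLE` string and its table contents are made up so the example runs without the network; they are not the real page, and the XPath is only an illustration.

```python
from lxml import html

# Hypothetical stand-in for the downloaded page (not the real menu),
# so this sketch runs offline.
SAMPLE = """
<html><body>
<table>
<tr><td>Monday</td><td>Soup</td></tr>
<tr><td>Friday</td><td>Fish</td></tr>
</table>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# xpath() returns a list of Element objects, just like cssselect() does.
rows = tree.xpath("//table//tr[2]")
row = rows[0]

# Serialize the element back to an HTML string.
markup = html.tostring(row, encoding="unicode")

# Or take only the text inside the element, with the tags stripped.
text = row.text_content()

print(markup)
print(text)
```

Printing the list itself only shows `[<Element tr at 0x...>]`; indexing into it and serializing the element is what yields readable output. The single-argument `print(...)` calls work under both Python 2 and Python 3.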