pyquery returns [None] when opening file

Question

If I open an HTML file, base_result.html, with pyquery, it returns [None] and raises errors when I search it. If I read that same file and pass its contents as a string, everything works well.

>>> from pyquery import PyQuery
>>> d = PyQuery(filename = 'base_result.html')
>>> d
[None]
>>> f = open('base_result.html')
>>> d = PyQuery(f.read())
>>> d
[<html>]

Is this the documented behaviour? I have two identical files, one online and one local, but the parsing for `url=` and `filename=` is different. — maged, Aug 06 '13 at 18:30
I stand corrected; I can't see why it would return `None` (though if the parsing for `url=` and `filename=` were meant to be the same, they wouldn't need two separate keywords!). But yeah, I don't know how you're getting a None return value. Are you sure you have the latest version? — Henry Keiter, Aug 06 '13 at 18:39
Yeah, it's the latest (from https://github.com/gawel/pyquery). Two keywords make sense, because loading an HTML file from a URL and loading one from a file path require different Python functions. I guess it could have been parsed, though. — maged, Aug 06 '13 at 18:43

score 1 · Accepted Answer · answered Aug 07 '13 at 16:43

It's an open issue in PyQuery: https://github.com/gawel/pyquery/issues/22

Some workarounds are mentioned in the issue linked above, such as:

>>> from lxml.html import parse
>>> from pyquery import PyQuery as pq
>>> parse("index.html")
<lxml.etree._ElementTree object at 0x108a72f38>
>>> pq(parse("index.html").getroot())

or

>>> from pyquery import PyQuery
>>> with open('index.html') as f:
...     d = PyQuery(f.read())
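
The second workaround can be wrapped in a small helper so the file is always closed after reading. This is only a sketch: the names `read_html` and `load_document` are hypothetical, and the UTF-8 default is an assumption about how the file is encoded. The one pyquery call it relies on is `PyQuery(<string>)`, the same call shown above.

```python
def read_html(path, encoding="utf-8"):
    """Return the contents of an HTML file as a string.

    The encoding default is an assumption; pass the real one if it differs.
    """
    # The context manager closes the file even if reading fails.
    with open(path, encoding=encoding) as f:
        return f.read()


def load_document(path, encoding="utf-8"):
    """Build a PyQuery object from a local file without using filename=.

    Hypothetical helper: it sidesteps the filename= keyword by handing
    PyQuery the markup as a string.
    """
    # Imported here so read_html stays usable without pyquery installed.
    from pyquery import PyQuery

    return PyQuery(read_html(path, encoding))
```

Calling `load_document('base_result.html')` should then behave like the string-based call in the question rather than the `filename=` call.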
