
Given the following URL:

   http://cisbp-rna.ccbr.utoronto.ca/TFreport.php?searchTF=T00022_0.6

This code has no problem parsing it:

from pyquery import PyQuery as pq
url = "http://cisbp-rna.ccbr.utoronto.ca/TFreport.php?searchTF=T00022_0.6"
page = pq(url)
for tb in page('table.tf_report').eq(0).items():
    print(tb("tr").eq(4)("td").eq(0).text())

Which prints

 PF00642 (zf-CCCH) PF00098 (zf-CCHC) PF00076 (RRM_1)

But when I download the page and save it to my local disk, the same code fails to parse it:

from pyquery import PyQuery as pq
# this is a local HTML file
url = "T00022_0.6.html"
page = pq(url)
for tb in page('table.tf_report').eq(0).items():
    print(tb("tr").eq(4)("td").eq(0).text())

Which prints nothing.

The local file above can be downloaded here.

What's the right way to do it?

pdubois

1 Answer


A local filename isn't a URL, even if you store it in a variable named url. Try:

page = pq(filename=url)

Alternatively, you could use an actual file: URL.
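For the file: URL route, here is a minimal sketch using only the standard library. It assumes the saved page is an ordinary file on disk; the helper name local_file_url and the tiny stand-in HTML are made up for illustration and are not part of the original question.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def local_file_url(path):
    """Return a file: URL for a local path (hypothetical helper).

    Path.as_uri() only accepts absolute paths, so resolve() first.
    """
    return Path(path).resolve().as_uri()


# Stand-in for the downloaded page, written to a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
sample = tmp_dir / "T00022_0.6.html"
sample.write_text(
    "<table class='tf_report'><tr><td>PF00076 (RRM_1)</td></tr></table>",
    encoding="utf-8",
)

file_url = local_file_url(sample)
print(file_url)

# Confirm the URL is real by reading the file back through it with urllib.
with urlopen(file_url) as response:
    html = response.read().decode("utf-8")
print(html)
```

The resulting string can then be passed to pyquery explicitly as pq(url=file_url). Whether that works may depend on how your pyquery installation fetches URLs, so pq(filename=...) is probably the safer choice for a file that is already on disk.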

kindall