Questions tagged [pyquery]

pyquery is a jQuery-like library for Python that allows you to make jQuery queries on XML documents.

PyQuery uses lxml for fast XML and HTML manipulation.

It allows you to make jQuery-style CSS-selector queries on XML/HTML documents. The API is intended to match jQuery's API whenever possible, though it has been made more Pythonic where appropriate.

It can be used for many purposes. The main idea is to use it for templating with pure http templates that you modify using pyquery. It can also be used for web scraping or for theming applications with Deliverance.


97 questions
0 votes, 1 answer

python pyquery import not working on Mac OS Sierra

I'm trying to import pyquery as I have done hundreds of times before, and it's not working. It looks related to Mac OS Sierra. (The module is installed with pip and up to date.) from pyquery import PyQuery as pq And got an error on the…
Alex Pereira
0 votes, 1 answer

Function to get a match for attributes in a list

I am trying to create a function to reduce what will be a lot of repeated code assigning to variables. Currently if I do this it works from pyquery import PyQuery as pq import pandas as pd d = pq(filename='20160319RHIL0_edit.xml') # from…
sayth
0 votes, 1 answer

PyQuery - attr match only returns first match not all matches

When using pyquery I am not receiving every match for a selector, just the first. Given this sample
sayth
0 votes, 1 answer

pdfquery/PyQuery: example code shows no AttributeError but mine does...why?

I'm following the example code found here. The author has some documentation where he lists the steps that he used to write the program. When I run the whole program together it runs perfectly, but when I follow the steps he's put I get an…
otteheng
0 votes, 1 answer

Different Output From Same PyQuery Object

I am using scrapy to crawl a web site. with open('test.html', 'wb') as f: f.write(response.body) With this block I am writing the body to a file. When I open the file I can see many "a" tags. When I print the same thing with print. It…
AnovaConsultancy
0 votes, 2 answers

Why am I receiving this error when trying to install pyquery python module?

I'm using the following command !pip install pyquery I'm able to install other modules. Does anyone know why this is happening? Thanks! src/lxml/includes/etree_defs.h:14:10: fatal error: 'libxml/xmlversion.h' file not found #include…
James Eaves
0 votes, 1 answer

PyQuery find the sub element node text

Here is the code: from pyquery import PyQuery content = '''
Traceback (most recent call last):
  File "./crawler.py", line…
lqhcpsgbl
0 votes, 1 answer

Python Scrape website with Requests and lxml..

Using this as a starting point.. http://docs.python-guide.org/en/latest/scenarios/scrape/ from lxml import html import requests page = requests.get('http://econpy.pythonanywhere.com/ex/001.html') tree = html.fromstring(page.text) Everything works…
Merlin
0 votes, 2 answers

Obtain the current value when parsing an html file with bokeh sliders

I am using bokeh to plot my math functions created with python/numpy. I would like to use sliders as shown in http://docs.bokeh.org/en/latest/docs/server_gallery/sliders_server.html Once I create the html file with the plot, I would like to…
geppo
0 votes, 1 answer

Splitting scraped data via PyQuery

I have the following scenario:

one

two
three
four

five
six

I would like to yield ['one','two','three','four','five','six']. So far I have: import PyQuery as pq s = pq(html) list =…
bmikolaj
0 votes, 1 answer

Parsing local versus online HTML page using PyQuery in Python

Given the following URL: http://cisbp-rna.ccbr.utoronto.ca/TFreport.php?searchTF=T00022_0.6 This code has no problem parsing it: from pyquery import PyQuery as pq url= "http://cisbp-rna.ccbr.utoronto.ca/TFreport.php?searchTF=T00022_0.6" page =…
pdubois
0 votes, 1 answer

Ads messing up my article crawling

What do I need to do when trying to crawl an article, but an Ad of sorts keeps showing up? Specifically, the ones that would pop up in the middle of the screen, asking to log in/sign up, and you have to manually close it before reading. Because of…
fsbinesh
0 votes, 1 answer

lxml/pyquery: parse in a less strict way

I am using PyQuery to process a large amount of documents from the Web. PyQuery uses lxml to parse the HTML documents. As a matter of fact, a lot of the documents are not valid HTML. As a consequence, those invalid documents cannot be successfully…
xiaohan2012
0 votes, 1 answer

Using PyQuery to ask a webpage to search result by zip-code

I am new to crawling data. I have to use PyQuery to crawl school info in the USA by zip code from the website http://www.greatschools.org/find-schools Each time I type in a zip code, the URL of the search page is very complicated. I think it is hard…
chenhao9255
0 votes, 2 answers

Python - Handling a javascript URL?

I am trying to download the HTML of a page that is requested through JavaScript, normally by clicking a link in the browser. I can download the first page because it has a general URL: http://www.locationary.com/stats/hotzone.jsp?hz=1 But there…
Marcus Johnson