Use pyquery to filter HTML

Question

I'm trying to use pyquery to parse HTML, and I've run into a problem I can't explain. My code is below:

from pyquery import PyQuery as pq
document = pq('<p id="hello">Hello</p><p id="world">World !!</p>')
p = document('p')
print(p.filter("#hello"))

I expect the print output to be:

<p id="hello">Hello</p>

But the actual output is:

<p id="hello">Hello</p><p id="world">World !!</p></div></html>

If I want only the specified part of the HTML rather than the rest of the content, how should I write it?

Thanks

Do you just want to find the `p` element having a specific `id` attribute? Can you use another library for that task? — balderman, Sep 30 '21 at 12:32
Yup, I just want to find a specific `p` element. Which library can I use for that? Thanks — Shulin Yang, Sep 30 '21 at 12:36
I really wish there was an answer to the question of how to get an element by specific attribute value using pyquery. — Aaron Bramson, Apr 20 '22 at 03:40

score 1 · Accepted Answer · answered Sep 30 '21 at 12:55

You can use the built-in ElementTree library:

import xml.etree.ElementTree as ET

html = '''<html><p id="hello">Hello</p><p id="world">World !!</p></html>'''
root = ET.fromstring(html)
p = root.find('.//p[@id="hello"]')
print(ET.tostring(p))

Output:

b'<p id="hello">Hello</p>'
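Note that ElementTree is an XML parser, so it only accepts well-formed markup with a single root element. The fragment from the question has two top-level `p` elements, which is why the snippet above wraps it in `<html>`. A minimal sketch of the same idea for a bare fragment follows; the helper name `find_by_id` and the `<root>` wrapper are illustrative assumptions, not part of any library. Passing `encoding="unicode"` makes `ET.tostring` return a `str` instead of `bytes`.

```python
import xml.etree.ElementTree as ET


def find_by_id(fragment, tag, element_id):
    """Return the first <tag> element whose id equals element_id, or None.

    Hypothetical helper: wraps the fragment in a dummy <root> element so
    that markup with several top-level elements still parses as XML.
    """
    root = ET.fromstring(f"<root>{fragment}</root>")
    for element in root.iter(tag):
        if element.get("id") == element_id:
            return element
    return None


fragment = '<p id="hello">Hello</p><p id="world">World !!</p>'
match = find_by_id(fragment, "p", "hello")

# encoding="unicode" yields a str rather than a bytes object
result = ET.tostring(match, encoding="unicode")
print(result)  # <p id="hello">Hello</p>
```

This only works when the input is valid XML. Real-world HTML with unclosed tags such as `<br>` raises `xml.etree.ElementTree.ParseError`, in which case an HTML-aware parser is the better choice.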
