It seems PyQuery
has a problem with this page, maybe because it is an XHTML
page, or maybe because it uses the namespace xmlns="http://www.w3.org/1999/xhtml".
When I use
pqPage.css('li')
then I get
[<{http://www.w3.org/1999/xhtml}html#sfFrontendHtml>]
which shows {http://www.w3.org/1999/xhtml}
in the element name. That prefix is the namespace, and some modules have problems with HTML
that uses namespaces.
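To illustrate what the namespace does to tag names, here is a minimal sketch using only the standard library's xml.etree.ElementTree rather than PyQuery. The XHTML string is a made-up stand-in for the real page (an assumption, not its actual content); the point is only that an XML parser stores every tag as {namespace}name, so a plain 'li' lookup finds nothing.

```python
import xml.etree.ElementTree as ET

# Made-up XHTML document standing in for the real page (assumption).
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <ul>
      <li><p>Mayor</p></li>
      <li><p>Clerk</p></li>
    </ul>
  </body>
</html>"""

root = ET.fromstring(XHTML)

# An XML parser keeps the namespace as part of every tag name.
print(root.tag)  # {http://www.w3.org/1999/xhtml}html

# A plain 'li' lookup matches nothing, because the real tag is '{namespace}li'.
plain = root.findall(".//li")

# Spelling out the namespace makes the lookup work.
ns = {"x": "http://www.w3.org/1999/xhtml"}
qualified = root.findall(".//x:li", ns)

print(len(plain), len(qualified))  # 0 2
```

This is probably the same effect behind the PyQuery result above: the selector looks for li, but the parsed elements are named with the namespace in front.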
I have no problem getting the items with BeautifulSoup:
import requests
from bs4 import BeautifulSoup as BS
url = "http://www.floridaleagueofcities.com/widgets/cityofficials?CityID=101"
page = requests.get(url)
soup = BS(page.text, 'html.parser')
for item in soup.find_all('li'):
    print(item.text)
EDIT: after some digging on Google I found that by passing parser="html" to PyQuery() I can get the li elements.
import requests
from pyquery import PyQuery
url = "http://www.floridaleagueofcities.com/widgets/cityofficials?CityID=101"
page = requests.get(url)
pqPage = PyQuery(page.text, parser="html")
for item in pqPage('li p'):
    print(item.text)