Questions tagged [html5lib]

html5lib is an open-source Python library for parsing and serializing HTML documents and fragments, based on the HTML specification. There are ports for PHP and Ruby (both unmaintained), as well as a third-party one for Dart.

107 questions
-1
votes
1 answer

Mechanize Select First Form returns "ImportError: No module named html5lib"

After reading this tutorial, I came up with this code: import requests from bs4 import BeautifulSoup import re import mechanize import cookielib # Browser br = mechanize.Browser() # Cookie Jar cj =…
maddie
-1
votes
1 answer

Problems parsing a web page in Python

I would like to parse a web page in order to retrieve some information from it (my exact problem is to retrieve all the items in this list: http://www.computerhope.com/vdef.htm). However, I can't figure out how to do it. A lot of tutorials on the…
clementescolano